CVPR 2025

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians

Chongjian Ge, Chenfeng Xu, Yuanfeng Ji, Chensheng Peng, Masayoshi Tomizuka, Ping Luo, Mingyu Ding, Varun Jampani, Wei Zhan


Abstract

Affiliations (list truncated in source): Adobe Research, 4 UNC-Chapel Hill, 5 Stability AI

[Figure 1 image omitted. Panel labels: Single 3D Generation; Compositional 3D Generation; Rendered Images; Gaussians; Initialization with 2D Compositionality; 1K iterations; 4K iterations. Example prompts: "an owl"; "an owl perches on a branch"; "an owl perches on a branch near a pinecone"; "an owl perches on a branch near a pinecone, with a rat below the branch"; "a footballer is kicking a soccer ball"; "a bird is drinking water from a cup"; "a photographer is capturing a beautiful butterfly with camera"; "a beautiful butterfly …".]

Figure 1. Illustration of compositional 3D generation and CompGS. All the contents are generated by CompGS. Top row: CompGS can generate either a single object (e.g., a butterfly) or multiple composed objects with reasonable interactions (e.g., the rightmost figure in the top row). Middle row: Beyond text-to-3D generation, CompGS can be easily extended to 3D editing by progressively adding objects. The colored text (e.g., 'a branch', 'a pinecone', 'a rat' in the rightmost figure) denotes the part added relative to the previous asset. Bottom row: CompGS achieves compositional text-to-3D by transferring 2D compositionality to initialize 3D Gaussians. CompGS is then trained with dynamic SDS optimization to produce plausible results.