ICLR 2025
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Jiayi Liu, Denys Iliash, Angel X. Chang, Manolis Savva, Ali Mahdavi Amiri
Abstract
We address the challenge of creating 3D assets for household articulated objects from a single image. Prior work on articulated object creation either requires multi-view multi-state input, or only allows coarse control over the generation process. These limitations hinder the scalability and practicality for articulated object modeling. In this work, we propose a method to generate articulated objects from a single image. Observing the object in resting state from an arbitrary view, our method generates an articulated object that is visually consistent with the input image. To capture the ambiguity in part shape and motion posed by a single view of the object, we design a diffusion model that learns the plausible variations of objects in terms of geometry and kinematics. To tackle the complexity of generating structured data with attributes in multiple domains, we design a pipeline that produces articulated objects from high-level structure to geometric details in a coarse-to-fine manner, where we use a part connectivity graph and part abstraction as proxies. Our experiments show that our method outperforms the state-of-the-art in articulated object creation by a large margin in terms of the generated object realism, resemblance to the input image, and reconstruction quality.
RELATED WORK
Generation of structured data. Our task is closely related to the generation of structured data (Chaudhuri et al., 2020). The generation of 3D shapes with semantic parts is a widely studied problem with the main goal of modeling objects with geometric details and semantic grouping at the part level. Prior work either synthesizes objects in voxels with semantic labels (Wang et al., 2018; Li et al., 2020; Wu et al., 2020) or further considers the spatial structure, such as symmetry and support relationship, by jointly learning in the latent space (Wu et al., 2019) or explicitly modeling