CVPR 2025

Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation

Jiantao Lin, Xin Yang, Meixi Chen, Yingjie Xu, Dongyu Yan, Leyi Wu, Xinli Xu, Lie Xu, Shunsi Zhang, Ying-Cong Chen

Abstract

Figure 1. A 3D Harry Potter scene built with Kiss3DGen. [Text-prompt labels visible in the figure: "A magic-ball", "A candle", "A Potter Shiba dog", "A red sofa", "An owl", "A magic book".] Our proposed framework, Kiss3DGen, is a unified 3D generation framework that supports various 3D generation tasks, including text-to-3D, image-to-3D, 3D enhancement, editing, and more. Specifically, most of the assets in the figure are generated from text (captioned with abbreviated text prompts) or image (marked by dashed lines) conditions, while the main characters (Hermione, Ron, and Potter) are created using a hybrid pipeline that combines image-to-3D and text-guided mesh editing. Please zoom in for details and refer to our main paper for a more detailed introduction.