CVPR 2025

FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement

Ian Huang, Yanan Bao, Karen Truong, Howard Zhou, Cordelia Schmid, Leonidas J. Guibas, Alireza Fathi

Abstract

Figure 1. FirePlace enables multi-modal large language models (MLLMs) to place new 3D objects into complex, preexisting 3D scenes, given (1) a 3D scene, (2) a 3D object, and (3) a language prompt. It combines MLLM common sense with low-level geometric constraints through the process described in this paper. Object placements generated by FirePlace are shown in red.