CVPR 2025
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement
Ian Huang, Yanan Bao, Karen Truong, Howard Zhou, Cordelia Schmid, Leonidas J. Guibas, Alireza Fathi
Abstract
Figure 1. FirePlace enables multi-modal large language models (MLLMs) to place new 3D objects into complex, preexisting 3D scenes, given (1) a 3D scene, (2) a 3D object, and (3) a language prompt. It combines MLLM common sense with low-level geometric constraints through the process described in this paper. Object placements generated by FirePlace are shown in red.