CVPR 2025

LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models

Fan-Yun Sun, Weiyu Liu, Siyi Gu, Dylan Lim, Goutam Bhat, Federico Tombari, Manling Li, Nick Haber, Jiajun Wu

Abstract

From unlabeled 3D assets and a language instruction, LayoutVLM generates scene layouts that are physically plausible and semantically coherent, two criteria that existing methods often struggle to meet. Our approach addresses this by using a vision-language model (VLM) to generate a scene layout representation that defines both an initial layout and spatial relations between assets for differentiable optimization.
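To make the idea of refining an initial layout under spatial relations concrete, the following is a minimal toy sketch, not the method of the paper. It assumes a 2D layout, a single hypothetical relation type (a target distance between two assets), hand-derived gradients, and plain gradient descent. The asset names, the `initial_layout` and `relations` values, and the functions `loss_and_grads` and `optimize` are all illustrative assumptions; in the described approach a VLM would produce the layout representation, whereas here it is hard-coded.

```python
import math

# Hypothetical initial layout (stand-in for VLM output): asset -> [x, y].
initial_layout = {
    "sofa": [0.0, 0.0],
    "lamp": [3.0, 4.0],
    "table": [5.0, 0.0],
}

# Hypothetical spatial relations: (asset_a, asset_b, target_distance).
relations = [
    ("sofa", "lamp", 1.0),
    ("sofa", "table", 2.0),
]


def loss_and_grads(layout, relations):
    """Return the sum of squared distance errors and its gradient per asset."""
    loss = 0.0
    grads = {name: [0.0, 0.0] for name in layout}
    for a, b, target in relations:
        dx = layout[a][0] - layout[b][0]
        dy = layout[a][1] - layout[b][1]
        dist = math.hypot(dx, dy) + 1e-9  # small offset avoids division by zero
        err = dist - target
        loss += err ** 2
        # d(err^2)/d(pos_a) = 2 * err * (pos_a - pos_b) / dist
        gx = 2.0 * err * dx / dist
        gy = 2.0 * err * dy / dist
        grads[a][0] += gx
        grads[a][1] += gy
        grads[b][0] -= gx
        grads[b][1] -= gy
    return loss, grads


def optimize(layout, relations, lr=0.05, steps=500):
    """Plain gradient descent starting from a copy of the given layout."""
    layout = {name: list(pos) for name, pos in layout.items()}
    for _ in range(steps):
        _, grads = loss_and_grads(layout, relations)
        for name in layout:
            layout[name][0] -= lr * grads[name][0]
            layout[name][1] -= lr * grads[name][1]
    return layout


final_layout = optimize(initial_layout, relations)
final_loss, _ = loss_and_grads(final_layout, relations)
```

The sketch separates the two ingredients named in the abstract: a starting configuration and a set of relations that define a differentiable objective. A realistic system would add further terms (for example, collision or boundary penalties) and would typically rely on automatic differentiation rather than hand-written gradients.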