ICLR 2025

UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting

Haoyuan Li, Yanpeng Zhou, Tao Tang, Jifei Song, Yihan Zeng, Michael Kampffmeyer, Hang Xu, Xiaodan Liang

Abstract

Figure 1: Left: the information gap across different 3D representations. Middle: UniGS, a novel unified text-image-3D pre-training framework that uses 3D Gaussian Splatting (3DGS) as its 3D representation. Right: UniGS learns a more general and stronger multi-modal representation.