CVPR 2025

Universal Scene Graph Generation

Shengqiong Wu, Hao Fei, Tat-Seng Chua

Abstract

associator to relieve the modality gap for cross-modal object alignment. Further, we propose a text-centric scene contrasting learning mechanism to mitigate domain imbalances by aligning multimodal objects and relations with textual SGs. Through extensive experiments, we demonstrate that USG offers a stronger capability for expressing scene semantics than standalone SGs, and also that our USG-Par achieves higher efficacy and performance. The project page is https://sqwu.top/USG/.