CVPR 2025

MESC-3D: Mining Effective Semantic Cues for 3D Reconstruction from a Single Image

Shaoming Li, Qing Cai, Songqi Kong, Runqing Tan, Heng Tong, Shiji Qiu, Yongguo Jiang, Zhi Liu

Abstract

Figure 1. (a) Previous methods perform only basic operations on the extracted 2D image information and the 3D point cloud, without establishing a connection between them. (b) In contrast, we introduce two key designs. First, the Effective Semantic Mining Module mines semantic information from the entangled features and enables the point cloud to select the necessary information. Second, the 3D Semantic Prior Learning Module aims to enable the model to interpret 3D structures as humans do when reconstructing 3D shapes from a single image. (c) Generalization comparison between the proposed MESC-3D and SOTA methods on base classes with complex backgrounds. (d) Zero-shot results of MESC-3D on unseen classes. (e) Comparison with state-of-the-art methods on the ShapeNet [1] dataset in terms of Chamfer Distance (y-axis), parameter count (marker area), and inference time (x-axis), showing that MESC-3D achieves the best performance at a comparable computational cost.
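
For readers unfamiliar with the metric on the y-axis of panel (e), the following is a minimal sketch of one common formulation of Chamfer Distance between two point sets. It is an assumption for illustration, not necessarily the exact variant used in this paper: it takes squared Euclidean distances, averages the nearest-neighbor distance in each direction, and sums the two directions. The names `squared_distance` and `chamfer_distance` are hypothetical helpers, and the brute-force search is only practical for small point sets.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(pred: Sequence[Point], target: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance (assumed variant: squared distances,
    mean over each set, sum of both directions)."""
    if not pred or not target:
        raise ValueError("both point sets must be non-empty")

    # For each predicted point, distance to its nearest target point.
    pred_to_target = sum(
        min(squared_distance(p, q) for q in target) for p in pred
    ) / len(pred)

    # For each target point, distance to its nearest predicted point.
    target_to_pred = sum(
        min(squared_distance(q, p) for p in pred) for q in target
    ) / len(target)

    return pred_to_target + target_to_pred


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
    print(chamfer_distance(predicted, ground_truth))
```

Lower values indicate that the predicted point cloud lies closer to the ground truth. Other variants (unsquared distances, or different scaling factors) are also in use, so absolute values are only comparable under the same convention.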