CVPR 2021

Scan2Cap: Context-Aware Dense Captioning in RGB-D Scans

Dave Zhenyu Chen, Ali Gholami, Matthias Nießner, Angel X. Chang

Abstract

Figure 1: We introduce the task of dense captioning in RGB-D scans with a model that can densely localize objects in a 3D scene and describe them using natural language in a single forward pass.