CVPR 2023

Distilling Neural Fields for Real-Time Articulated Shape Reconstruction

Jeff Tan, Gengshan Yang, Deva Ramanan

Abstract

Figure 1. We present a method that distills knowledge from dynamic NeRFs fitted to offline video data at scale [16, 44] to train category-specific, real-time video shape predictors, which output temporally consistent viewpoint, articulation, and appearance given casual input videos. Our method replaces expensive test-time optimization with a single forward pass, allowing real-time inference on an RTX 3090 GPU. Compared to existing model-based methods for reconstructing humans and animals in motion [13, 18, 31], our method does not require pre-defined 3D templates or ground-truth 3D data to train.