CVPR 2021

LASR: Learning Articulated Shape Reconstruction From a Monocular Video

Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu

Google Research

Abstract

[Figure 1 panel labels: frames t=4, t=8, t=12, t=16 and t=20, t=40, t=60, t=80; views 0° and 60°; methods LASR (Ours), PIFuHD, SMPLify-X, VIBE, UMR-horse, A-CSM (camel template), SMALify horse]

Figure 1. Top: Sample input video frames and the articulated shapes recovered by our method (LASR). Bottom: Comparison with existing methods; the input to each method (video or image) is shown at the top left of each result, and the shape template used is shown at the bottom right. Many existing approaches to nonrigid shape reconstruction rely heavily on category-specific 3D shape templates, such as SMPL for humans [33, 35] and SMAL for quadrupeds [6, 58]. In contrast, LASR jointly recovers the object shape, articulation, and camera parameters from a monocular video without using category-specific shape templates. By relying on generic shape and motion priors, LASR applies to a wider range of nonrigid shapes and yields high-fidelity 3D reconstructions: it recovers both humps of the camel, which the other methods miss, and the dancer's silk ribbon (marked by the blue box), which SMPLify-X and VIBE mistake for the right arm. Please refer to the video results on our project page.