NeurIPS 2022

NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos

Yi-Ling Qiao, Alexander Gao, Ming C. Lin

58 citations

Abstract

We present a method for learning the 3D geometry and physics parameters of a dynamic scene from only a monocular RGB video. To decouple the learning of the underlying scene geometry from dynamic motion, we represent the scene as a time-invariant signed distance function (SDF), which serves as a reference frame, together with a time-conditioned deformation field. We further bridge this neural geometry representation with a differentiable physics simulator by designing a two-way conversion between the neural field and its corresponding hexahedral mesh, enabling us to estimate physics parameters from the source video by minimizing a cycle consistency loss. Our method also allows a user to interactively edit 3D objects from the source video by modifying the recovered hexahedral mesh and propagating the operation back to the neural field representation. Experiments show that our method achieves superior mesh and video reconstruction of dynamic scenes compared to competing neural field approaches, and we provide extensive examples that demonstrate its ability to extract useful 3D representations from videos captured with consumer-grade cameras.

Recent work has extended NeRF to videos of dynamic scenes [46, 1, 72], but these methods primarily focus on 2D image synthesis rather than on reconstructing 3D geometry. They either condition the neural fields directly on time [13, 16, 63] or learn a canonical field separately from time-dependent motion [45, 47, 57, 5]. We choose the latter strategy because it better constrains the learning problem
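
As a rough illustration of the canonical-field strategy, the sketch below composes a time-invariant SDF with a time-conditioned deformation. It is a toy under stated assumptions, not the method's implementation: an analytic sphere stands in for the learned SDF network, a constant-velocity translation stands in for the learned deformation field, and the names canonical_sdf, deform_to_canonical, and sdf_at_time are hypothetical.

```python
import math

def canonical_sdf(p, radius=1.0):
    """Time-invariant SDF in the canonical (reference) frame.

    Assumption: an analytic sphere stands in for a learned SDF network.
    Negative inside the surface, zero on it, positive outside.
    """
    x, y, z = p
    return math.sqrt(x * x + y * y + z * z) - radius

def deform_to_canonical(p, t, velocity=(0.5, 0.0, 0.0)):
    """Time-conditioned deformation: map a point observed at time t
    back into the canonical frame.

    Assumption: a rigid translation at constant velocity stands in for
    a learned deformation field.
    """
    return tuple(pi - t * vi for pi, vi in zip(p, velocity))

def sdf_at_time(p, t):
    """Distance to the moving surface at time t, obtained by warping the
    query point into the canonical frame and evaluating the static SDF."""
    return canonical_sdf(deform_to_canonical(p, t))

if __name__ == "__main__":
    query = (1.5, 0.0, 0.0)
    for t in (0.0, 0.5, 1.0):
        print(t, sdf_at_time(query, t))
```

In this toy, all geometry lives in canonical_sdf and all motion lives in deform_to_canonical, so the shape is shared across every time step and only the warp varies with t. That separation is the sense in which a canonical field can constrain the problem more tightly than a field conditioned directly on time.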