CVPR 2021

One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing

Ting-Chun Wang, Arun Mallya, Ming-Yu Liu

NVIDIA Corporation

Abstract

(a) Original video (b) Compressed videos at the same bit-rate (c) Our re-rendered novel-view results

Figure 1: Our method can re-create a talking-head video using only a single source image (e.g., the first frame) and a sequence of unsupervisedly-learned 3D keypoints, representing motions in the video. Our novel keypoint representation provides a compact representation of the video that is 10× more efficient than the H.264 baseline can provide. A novel 3D keypoint decomposition scheme allows re-rendering the talking-head video under different poses, simulating often missed face-to-face video conferencing experiences. Video versions of the paper figures and additional results are available at our project page, https://nvlabs.github.io/face-vid2vid .
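To give a feel for why a keypoint stream can be so compact, the following is a minimal back-of-envelope sketch rather than the paper's actual codec. It estimates the raw bit-rate of sending only 3D keypoints per frame. The function name `keypoint_stream_kbps` and every number used (keypoint count, frame rate, bytes per coordinate) are illustrative assumptions, not values taken from the text above. The sketch also ignores the one-time cost of sending the source image and any entropy coding.

```python
def keypoint_stream_kbps(num_keypoints, fps, bytes_per_coord=2, dims=3):
    """Raw bit-rate (kbit/s) of a per-frame keypoint stream.

    All parameters are hypothetical placeholders: each keypoint is
    assumed to carry `dims` coordinates stored in `bytes_per_coord`
    bytes, with no further compression applied.
    """
    bits_per_frame = num_keypoints * dims * bytes_per_coord * 8
    return bits_per_frame * fps / 1000.0


# Assumed example values, chosen only for illustration.
kp_kbps = keypoint_stream_kbps(num_keypoints=20, fps=30)
print(f"keypoint stream: {kp_kbps:.1f} kbit/s")
```

Under these assumed values the per-frame payload is a few hundred bits, which stays constant regardless of image resolution. A conventional codec's bit-rate, by contrast, grows with resolution and scene content.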