CVPR 2024

EscherNet: A Generative Model for Scalable View Synthesis

Xin Kong, Shikun Liu, Xiaoyang Lyu, Marwan Taher, Xiaojuan Qi, Andrew J. Davison

Abstract

Figure 1. We introduce EscherNet, a diffusion model that can generate a flexible number of consistent target views (highlighted in blue) with arbitrary camera poses, based on a flexible number of reference views (highlighted in purple). EscherNet demonstrates remarkable precision in camera control and robust generalisation across synthetic and real-world images featuring multiple objects and rich textures.