CVPR 2025

Learning Visual Generative Priors without Text

Shuailei Ma, Kecheng Zheng, Ying Wei, Wei Wu, Fan Lu, Yifei Zhang, Chen-Wei Xie, Biao Gong, Jiapeng Zhu, Yujun Shen

Abstract

4 Alibaba Group 5 HKUST

https://ant-research.github.io/lumos

Figure 1. Diverse downstream tasks of Lumos, including (a) text-to-image generation, (b) novel view synthesis (left: input view, middle: random novel views, right: reconstruction Gaussian), and (c) image-to-video generation.