CVPR2023
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn, Huiwen Chang, José Lezama, Luisa Polania, Han Zhang, Yuan Hao, Irfan Essa, Lu Jiang
Abstract
Figure 1. Image synthesis by knowledge transfer. Unlike previous works using GANs as base model and test transfer on relatively narrow visual domains, we transfer knowledge of generative vision transformers [7, 15] to a wide range of visual domains, including natural (e.g., scene, flower), specialized (e.g., satellite, medical), and structured (e.g., road scenes, infograph, sketch) with a few training images. Notably, the prompt tuning significantly improves the prior best FID on two benchmarks ImageNet (85.9!16.3) and Places (71.3!24.2).