CVPR 2024

Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

Kelvin C. K. Chan, Yang Zhao, Xuhui Jia, Ming-Hsuan Yang, Huisheng Wang

Abstract

[Figure 1 image. Panels: Reference, w/o SAG, w/ SAG. Prompts: "S*, swimming in front of Eiffel Tower, Van Gogh starry night style"; "S* in a basket, at a beach"; "S* with a cloudy night sky and a moon"; "S*, Pixar movie style".]

Figure 1. Addressing Content Ignorance. Given user-provided subject images, part of the content specified in the text prompt (highlighted in blue) is overlooked. Our Subject-Agnostic Guidance (SAG) aligns the output more closely with both the target subject and the text prompt. Here S* denotes a pseudo-word whose text embedding is replaced by a learnable subject embedding.