ICLR 2025

Enhancing Compositional Text-to-Image Generation with Reliable Random Seeds

Shuangqi Li, Hieu Le, Jingyi Xu, Mathieu Salzmann

Abstract

Figure 1: Example images generated by Stable Diffusion 2.1 and ours. Existing text-to-image diffusion models are prone to making mistakes in numeracy and spatial relations. Prompts shown, each for Stable Diffusion and for ours: "Four unicorns", "Two dogs", "A dove on top of a basketball", "A penguin on the right of a bowl".