CVPR2025

AMO Sampler: Enhancing Text Rendering with Overshooting

Xixi Hu, Keyang Xu, Bo Liu, Qiang Liu, Hongliang Fei

摘要

Achieving precise alignment between textual instructions and generated images in text-to-image generation is a significant challenge, particularly in rendering written text within images. Sate-of-the-art models like Stable Diffusion 3 (SD3), Flux, and AuraFlow still struggle with accurate text depiction, resulting in misspelled or inconsistent text. We introduce a training-free method with minimal computational overhead that significantly enhances text rendering quality. Specifically, we introduce an overshooting sampler for pretrained rectified flow (RF) models, by alternating between over-simulating the learned ordinary differential equation ⇤ Equal contribution.