ICLR 2026
Overshoot and Shrinkage in Classifier-Free Guidance: From Theory to Practice
Krunoslav Lehman Pavasovic, Jakob Verbeek, Giulio Biroli, Marc Mezard
Abstract
Classifier-Free Guidance (CFG) is widely used in diffusion and flow-based generative models for high-quality conditional generation, yet its theoretical properties remain incompletely understood. By connecting CFG to the high-dimensional framework of diffusion regimes, we show that in sufficiently high dimensions it reproduces the correct target distribution, a "blessing-of-dimensionality" result. Leveraging this theoretical framework, we analyze how the well-known artifacts of mean overshoot and variance shrinkage emerge in lower dimensions, characterizing how they become more pronounced as dimensionality decreases. Building on these insights, we propose a simple nonlinear extension of CFG and prove that it mitigates both effects while preserving CFG's practical benefits. Finally, we validate our approach through numerical simulations on Gaussian mixtures and real-world experiments on state-of-the-art diffusion and flow-matching models, both class-conditional and text-to-image, demonstrating continuous improvements in sample quality, diversity, and consistency.

Figure 1: Qualitative comparison of unguided sampling (No CFG), standard Classifier-Free Guidance (Stand. CFG), and our proposed nonlinear power-law CFG (Power-law CFG), using DiT/XL-2 on ImageNet-1K 256 × 256. Standard CFG increases fidelity at a substantial expense to diversity and semantic meaning compared to unguided sampling. Our power-law guidance improves fidelity at no cost to semantics or diversity. The samples in each column start from the same seed.
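To make the contrast between standard and nonlinear guidance concrete, the sketch below shows the usual linear CFG combination next to one possible power-law variant. The linear rule uses the common convention in which omega = 0 recovers the conditional prediction. The abstract does not state the formula of the proposed extension, so the power-law form here (guidance weight scaled by the norm of the conditional-minus-unconditional difference raised to a power alpha) is an assumption for illustration only. The function names and the parameters omega and alpha are likewise hypothetical.

```python
import math
from typing import List, Sequence


def cfg_combine(cond: Sequence[float], uncond: Sequence[float],
                omega: float) -> List[float]:
    """Standard (linear) CFG: push the conditional prediction further
    along the direction (cond - uncond) with a constant weight omega."""
    return [c + omega * (c - u) for c, u in zip(cond, uncond)]


def power_law_cfg_combine(cond: Sequence[float], uncond: Sequence[float],
                          omega: float, alpha: float) -> List[float]:
    """Illustrative nonlinear variant (ASSUMED form, not taken from the text):
    the weight is omega * ||cond - uncond||**alpha, so it depends on how far
    apart the two predictions are. alpha = 0 gives back standard CFG."""
    diff = [c - u for c, u in zip(cond, uncond)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        # Predictions coincide: there is no guidance direction to follow.
        return list(cond)
    weight = omega * norm ** alpha
    return [c + weight * d for c, d in zip(cond, diff)]


if __name__ == "__main__":
    cond, uncond = [3.0, 4.0], [0.0, 0.0]
    print(cfg_combine(cond, uncond, omega=2.0))
    print(power_law_cfg_combine(cond, uncond, omega=2.0, alpha=0.0))
    print(power_law_cfg_combine(cond, uncond, omega=2.0, alpha=1.0))
```

Under this assumed form with alpha > 0, guidance is strong where the conditional and unconditional predictions disagree and fades where they nearly coincide, whereas a constant weight is applied everywhere in the linear rule. In a sampler, either function would be applied to the model's score or noise predictions at every step.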