ICLR 2026

Any-step Generation via N-th Order Recursive Consistent Velocity Field Estimation

Peng Sun, Tao Lin

Abstract

Recent advances in few-step generative models (typically 1-8 steps), such as consistency models, have yielded impressive performance. However, their broader adoption is hindered by significant challenges, including substantial computational overhead, the reliance on complex multi-component loss functions, and intricate multi-stage training strategies that lack end-to-end simplicity. These limitations impede their scalability and stability, especially when applied to large-scale models.

To address these issues, we introduce N-th order Recursive Consistent velocity field estimation for Generative Modeling (RCGM), a novel framework that unifies many existing approaches. Within this framework, we reveal that conventional one-step methods, such as consistency and MeanFlow models, are special cases of 1st-order RCGM. This insight enables a natural extension to higher-order scenarios (N ≥ 2), which exhibit markedly improved training stability and achieve state-of-the-art (SOTA) performance.

For instance, on ImageNet 256×256, RCGM enables a 675M parameter diffusion transformer to achieve a 1.48 FID score in just 2 sampling steps. Crucially, RCGM facilitates the stable full-parameter training of a large-scale (20B) unified multi-modal model, attaining a 0.86 GenEval score in 2 steps. In contrast, conventional 1st-order approaches, such as consistency and MeanFlow models, typically suffer from training instability, model collapse, or memory constraints under comparable settings.

Code is available at: https://github.com/LINs-lab/RCGM.