ICML 2025

A General Framework for Inference-time Scaling and Steering of Diffusion Models

Raghav Singhal, Zachary Horvitz, Ryan Teehan, Mengye Ren, Zhou Yu, Kathleen McKeown, Rajesh Ranganath

Abstract

Diffusion models have demonstrated remarkable performance in generative modeling, but generating samples with specific desiderata remains challenging. Existing solutions, such as finetuning, best-of-n sampling, and gradient-based guidance, are expensive, inefficient, or limited in applicability. In this work, we introduce Feynman-Kac (FK) steering, which applies Feynman-Kac interacting particle systems to the inference-time steering of diffusion models with arbitrary reward functions. FK steering works by generating multiple trajectories, called particles, and resampling particles at intermediate steps based on scores computed using functions called potentials. Potentials are defined using rewards for intermediate states and are chosen such that a high score indicates the particle will yield a high-reward sample. We explore various choices of potentials, rewards, and samplers. Steering text-to-image models with a human preference reward, we find that FK steering outperforms finetuned models with just 2 particles. Moreover, FK steering a 0.8B parameter model outperforms a 2.6B model, achieving state-of-the-art performance on prompt fidelity. We also steer text diffusion models with rewards for text quality and rare attributes such as toxicity, and find that FK steering generates lower perplexity text and enables gradient-free control. Overall, inference-time scaling and steering of diffusion models, even training-free, provides significant quality and controllability benefits. Code available here.
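
The particle-and-resampling loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the one-dimensional "sampler" (`toy_denoise_step`), the reward (`toy_reward`), the choice of potential (the exponentiated change in reward between steps, scaled by a hypothetical `lam`), and the resampling schedule (`resample_every`) are all assumptions made only so the example is self-contained.

```python
import numpy as np

def toy_reward(x, target=2.0):
    """Hypothetical reward: larger when a particle is close to `target`."""
    return -(x - target) ** 2

def toy_denoise_step(x, rng, step_scale=0.5):
    """Stand-in for one sampler step (a random walk, not a real diffusion model)."""
    return x + step_scale * rng.standard_normal(x.shape)

def fk_steer(num_particles=8, num_steps=20, lam=1.0, resample_every=5, seed=0):
    """Run a toy particle system with reward-based resampling.

    Assumed potential: exp(lam * (r(x_t) - r(x_{t-1}))), accumulated in log
    space between resampling steps. lam=0 gives uniform weights, so the
    resampling no longer depends on the reward.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(num_particles)      # initial particles
    prev_r = toy_reward(x)
    log_w = np.zeros(num_particles)              # accumulated log-potentials

    for t in range(1, num_steps + 1):
        x = toy_denoise_step(x, rng)             # propose the next state
        r = toy_reward(x)                        # reward of intermediate state
        log_w += lam * (r - prev_r)              # score each particle
        prev_r = r

        if t % resample_every == 0:
            w = np.exp(log_w - log_w.max())      # stabilized weights
            w /= w.sum()
            idx = rng.choice(num_particles, size=num_particles, p=w)
            x, prev_r = x[idx], prev_r[idx]      # keep high-scoring particles
            log_w = np.zeros(num_particles)      # reset after resampling

    return x, toy_reward(x)

if __name__ == "__main__":
    particles, rewards = fk_steer(num_particles=256)
    _, baseline = fk_steer(num_particles=256, lam=0.0)
    print("mean reward, steered:  ", rewards.mean())
    print("mean reward, unsteered:", baseline.mean())
```

In this toy setting, comparing `lam=1.0` against `lam=0.0` shows the effect of steering: resampling by potential concentrates particles where the reward is high, without any gradient of the reward being computed.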