ICLR 2025
Heavy-Tailed Diffusion Models
Kushagra Pandey, Jaideep Pathak, Yilun Xu, Stephan Mandt, Michael S. Pritchard, Arash Vahdat, Morteza Mardani
Abstract
Diffusion models achieve state-of-the-art generation quality across many applications, but their ability to capture rare or extreme events in heavy-tailed distributions remains unclear. In this work, we show that traditional diffusion and flow-matching models with standard Gaussian priors fail to capture heavy-tailed behavior. We address this by repurposing the diffusion framework for heavy-tail estimation using multivariate Student-t distributions. We develop a tailored perturbation kernel and derive the denoising posterior based on the conditional Student-t distribution for the backward process. Inspired by γ-divergence for heavy-tailed distributions, we derive a training objective for heavy-tailed denoisers. The resulting framework introduces controllable tail generation using only a single scalar hyperparameter, making it easily tunable for diverse real-world distributions. As specific instantiations of our framework, we introduce t-EDM and t-Flow, extensions of existing diffusion and flow models that employ a Student-t prior. Remarkably, our approach is readily compatible with standard Gaussian diffusion models and requires only minimal code changes. Empirically, we show that our t-EDM and t-Flow outperform standard diffusion models in heavy-tail estimation on high-resolution weather datasets in which generating rare and extreme events is crucial.
We extend widely adopted diffusion models, such as EDM (Karras et al., 2022) and straight-line flows (Lipman et al., 2023; Liu et al., 2022), by introducing their Student-t counterparts: t-EDM and t-Flow. We derive the corresponding SDEs and ODEs for modeling heavy-tailed distributions. Through extensive experiments on the HRRR dataset (Dowell et al., 2022), we train both unconditional and conditional versions of these models. The results show that standard EDM struggles to capture tails and extreme events, whereas t-EDM performs significantly better in modeling such phenomena.
To summarize, we present:
• Heavy-tailed Diffusion Models. We repurpose the diffusion model framework for heavy-tail estimation by formulating both the forward and reverse processes using multivariate Student-t distributions. The denoiser is learned by minimizing the γ-power divergence (Kim et al., 2024) between the forward and reverse posteriors.
• Continuous Counterparts. We derive continuous formulations for heavy-tailed diffusion models and provide a principled approach to constructing ODE and SDE samplers. This enables the instantiation of t-EDM and t-Flow as heavy-tailed alternatives to standard diffusion and flow models.
• Empirical Results. Experiments on the HRRR dataset (Dowell et al., 2022), a high-resolution dataset for weather modeling, show that t-EDM significantly outperforms EDM in capturing tail distributions for both unconditional and conditional tasks.
• Theoretical Connections. To theoretically justify the effectiveness of our approach, we present several theoretical connections between our framework and existing work in diffusion models and robust statistical estimators (Futami et al., 2018).
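To make the idea of a Student-t forward process concrete, the following minimal sketch draws multivariate Student-t noise using the standard Gaussian scale-mixture identity and adds it to a clean sample. It is an illustration under stated assumptions, not the paper's implementation: the function names, the additive form `x0 + sigma * noise`, and the identification of the degrees of freedom `nu` with the single scalar tail hyperparameter are all assumptions made here for exposition.

```python
import math
import random


def sample_student_t_noise(dim, nu, rng):
    """Draw one sample from the multivariate Student-t t_dim(0, I, nu).

    Uses the Gaussian scale-mixture identity: if z ~ N(0, I) and
    kappa ~ chi2(nu) / nu independently, then z / sqrt(kappa) ~ t_dim(0, I, nu).
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # chi2(nu) equals Gamma(shape=nu/2, scale=2). A single kappa is shared by
    # all coordinates, which makes the draw jointly (not coordinate-wise) t.
    kappa = rng.gammavariate(nu / 2.0, 2.0) / nu
    scale = 1.0 / math.sqrt(kappa)
    return [scale * zi for zi in z]


def perturb(x0, sigma, nu, rng):
    """Hypothetical forward perturbation: x0 plus sigma-scaled Student-t noise."""
    noise = sample_student_t_noise(len(x0), nu, rng)
    return [xi + sigma * ni for xi, ni in zip(x0, noise)]


def tail_fraction(samples, threshold):
    """Fraction of samples whose first coordinate exceeds threshold in magnitude."""
    return sum(abs(s[0]) > threshold for s in samples) / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    heavy = [sample_student_t_noise(2, 3.0, rng) for _ in range(20000)]
    near_gaussian = [sample_student_t_noise(2, 1e6, rng) for _ in range(20000)]
    print("tail mass beyond 4, nu=3:  ", tail_fraction(heavy, 4.0))
    print("tail mass beyond 4, nu=1e6:", tail_fraction(near_gaussian, 4.0))
```

In this sketch a small `nu` places noticeably more mass far from the origin, while a very large `nu` behaves almost like Gaussian noise, which is one way to read the claim that tail behavior is controlled by a single scalar.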