ICLR 2025

Shallow diffusion networks provably learn hidden low-dimensional structure

Nicholas Matthew Boffi, Arthur Jacot, Stephen Tu, Ingvar M. Ziemann


Abstract

Diffusion-based generative models provide a powerful framework for learning to sample from a complex target distribution. The remarkable empirical success of these models on high-dimensional signals, including images and video, stands in stark contrast to classical results highlighting the curse of dimensionality for distribution recovery. In this work, we take a step towards understanding this gap through a careful analysis of learning diffusion models over the Barron space of single-layer neural networks. In particular, we show that these shallow models provably adapt to simple forms of low-dimensional structure, thereby avoiding the curse of dimensionality. We combine our results with recent analyses of sampling with diffusion models to provide an end-to-end sample complexity bound for learning to sample from structured distributions. Importantly, our results do not require specialized architectures tailored to particular latent structures, and instead rely on the low-index structure of the Barron space to adapt to the underlying distribution.

To implement this scheme in practice, the score function must be learned, and the reverse process must be discretized. Assuming access to a learned score function ŝ ≈ ∇ log p, we now consider discretizing (3.3). In this work we make use of the exponential integrator (EI), which fixes a sequence (to be specified) of reverse process timesteps 0 = τ … N − 1. (3.4)

Recently, building on the works of Chen et al. (2023a) and Lee et al. (2023), Benton et al. (2024) showed that it suffices to control the score approximation error in L^2(p_t) to guarantee that the process (3.4) yields a high-quality sample from p_0.

Score function estimation. To estimate the score function ∇ log p_t over the interval [0, T], one would ideally minimize the least-squares objective over a model ŝ, R(ŝ) :=
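
As an illustration of the exponential integrator idea mentioned above, the following is a minimal sketch rather than the paper's scheme. Equations (3.3) and (3.4) are not reproduced in this excerpt, so the sketch makes several assumptions: the forward process is the Ornstein-Uhlenbeck SDE dx = −x dt + √2 dB, the timestep grid is uniform (the paper leaves it "to be specified"), the reverse process stops at a small time `delta` before T, and the target is a one-dimensional Gaussian whose exact score stands in for a learned ŝ. The names `gaussian_score`, `ei_step`, and `sample_reverse` are hypothetical. Under these assumptions, freezing the score on each interval makes the reverse SDE linear, so each step can be solved in closed form.

```python
import math
import random


def gaussian_score(x, t, mean0, var0):
    """Exact score of p_t when p_0 = N(mean0, var0), assuming the OU
    forward process dx = -x dt + sqrt(2) dB (an assumption of this sketch)."""
    mean_t = mean0 * math.exp(-t)
    var_t = var0 * math.exp(-2.0 * t) + 1.0 - math.exp(-2.0 * t)
    return -(x - mean_t) / var_t


def ei_step(y, s_hat, h, z):
    """One exponential-integrator step of size h.

    With the score frozen at s_hat, the reverse SDE
    dy = (y + 2 * s_hat) dt + sqrt(2) dB is linear and has the exact
    solution below; z is a standard normal draw.
    """
    return (math.exp(h) * y
            + 2.0 * (math.exp(h) - 1.0) * s_hat
            + math.sqrt(math.exp(2.0 * h) - 1.0) * z)


def sample_reverse(score, T, n_steps, delta, n_samples, seed=0):
    """Run the discretized reverse process on a uniform grid
    0 = tau_0 < ... < tau_N = T - delta (grid choice is an assumption)."""
    rng = random.Random(seed)
    taus = [k * (T - delta) / n_steps for k in range(n_steps + 1)]
    # Initialize from the stationary law N(0, 1) of the forward process.
    ys = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for k in range(n_steps):
        h = taus[k + 1] - taus[k]
        t = T - taus[k]  # forward time at which the score is evaluated
        ys = [ei_step(y, score(y, t), h, rng.gauss(0.0, 1.0)) for y in ys]
    return ys


if __name__ == "__main__":
    # Stand-in for a learned score: the exact score of N(1, 4).
    score = lambda x, t: gaussian_score(x, t, mean0=1.0, var0=4.0)
    ys = sample_reverse(score, T=5.0, n_steps=200, delta=0.01, n_samples=3000)
    mean = sum(ys) / len(ys)
    var = sum((y - mean) ** 2 for y in ys) / len(ys)
    print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

In this sketch, any callable `score(x, t)` could replace the analytic Gaussian score, which is where a learned model would enter. The Gaussian case is used only because its score is known exactly, so the sampler's output can be compared against the known mean and variance of the target.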