ICLR 2026

When Scores Learn Geometry: Rate Separations under the Manifold Hypothesis

Xiang Li, Zebang Shen, Ya-Ping Hsieh, Niao He


Abstract

Score-based methods, such as diffusion models and Bayesian inverse problems, are often interpreted as learning the data distribution in the low-noise limit ($\sigma \to 0$). In this work, we propose an alternative perspective: their success arises from implicitly learning the data manifold rather than the full distribution. Our claim is based on a novel analysis of scores in the small-$\sigma$ regime that reveals a sharp separation of scales: information about the data manifold is $\Theta(\sigma^{-2})$ stronger than information about the distribution. We argue that this insight suggests a paradigm shift from the less practical goal of distributional learning to the more attainable task of geometric learning, which provably tolerates $O(\sigma^{-2})$ larger errors in score approximation. We illustrate this perspective through three consequences: i) in diffusion models, concentration on data support can be achieved with a score error of $o(\sigma^{-2})$, whereas recovering the specific data distribution requires a much stricter $o(1)$ error; ii) more surprisingly, learning the uniform distribution on the manifold (an especially structured and useful object) is also $O(\sigma^{-2})$ easier; and iii) in Bayesian inverse problems, the maximum entropy prior is $O(\sigma^{-2})$ more robust to score errors than generic priors. Finally, we validate our theoretical findings with preliminary experiments on large-scale models, including Stable Diffusion.
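The separation of scales described in the abstract can be illustrated numerically on a toy example. The sketch below is not taken from the paper: it assumes data supported on the unit circle in the plane with a hypothetical non-uniform density proportional to 1 + 0.5 cos(theta), smooths it with Gaussian noise of scale sigma, and evaluates the resulting score at a point a fixed distance d off the circle. The function name `smoothed_score`, the grid size, and the values of `phi` and `d` are illustrative choices only.

```python
import numpy as np

def smoothed_score(x, sigma, density, n_grid=20000):
    """Score of the Gaussian-smoothed density at x, for data on the unit circle.

    Uses the identity score(x) = (E[y | x] - x) / sigma^2, with the posterior
    over manifold points y approximated on a uniform grid of angles.
    """
    theta = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    y = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # points on the circle
    diff = y - x                                           # shape (n_grid, 2)
    log_w = np.log(density(theta)) - np.sum(diff**2, axis=1) / (2.0 * sigma**2)
    w = np.exp(log_w - log_w.max())                        # stabilised posterior weights
    w /= w.sum()
    return (w @ diff) / sigma**2

def density(theta):
    # Hypothetical unnormalised, non-uniform density on the circle (an assumption).
    return 1.0 + 0.5 * np.cos(theta)

phi, d = 1.0, 0.05                       # angle of the query point, offset from the circle
r = 1.0 + d
normal = np.array([np.cos(phi), np.sin(phi)])    # unit normal to the circle at angle phi
tangent = np.array([-np.sin(phi), np.cos(phi)])  # unit tangent at angle phi
x = r * normal

results = []
for sigma in (0.04, 0.02, 0.01):
    s = smoothed_score(x, sigma, density)
    s_normal, s_tangent = s @ normal, s @ tangent
    results.append((sigma, s_normal, s_tangent))
    print(f"sigma={sigma:.2f}  normal={s_normal:10.3f}  "
          f"sigma^2*normal={sigma**2 * s_normal:8.5f}  tangent={s_tangent:8.5f}")
```

In this toy setting, the normal component of the score, which encodes the distance to the manifold, grows like d / sigma^2 in magnitude, while the tangential component, which carries the dependence on the density along the manifold, stays of order one (close to the derivative of the log-density divided by r). Their ratio therefore scales as sigma^{-2}, consistent with the rate separation stated in the abstract.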