Published as a conference paper at ICLR 2025

On the Feature Learning in Diffusion Models

Andi Han, Wei Huang, Yuan Cao, Difan Zou

Abstract

The predominant success of diffusion models in generative modeling has spurred significant interest in understanding their theoretical foundations. In this work, we propose a feature learning framework aimed at analyzing and comparing the training dynamics of diffusion models with those of traditional classification models. Our theoretical analysis demonstrates that diffusion models, due to the denoising objective, are encouraged to learn more balanced and comprehensive representations of the data. In contrast, neural networks with a similar architecture trained for classification tend to prioritize learning specific patterns in the data, often focusing on easy-to-learn components. To support these theoretical insights, we conduct several experiments on both synthetic and real-world datasets, which empirically validate our findings and highlight the distinct feature learning dynamics in diffusion models compared to classification.

PROBLEM SETTING

This section introduces the problem setting for both the diffusion model and classification, including the data model, the neural network functions, and the training objectives and algorithm.

Definition 2.1 (Data distribution). Each data sample consists of two patches, $x = [x^{(1)\top}, x^{(2)\top}]^\top$, where each patch is generated as follows:

• Sample $y \in \{-1, 1\}$ uniformly, with $P(y = -1) = P(y = 1) = 1/2$.
• Given two orthogonal signal vectors $\mu_1, \mu_{-1}$ with $\mu_1 \perp \mu_{-1}$, we set $x^{(1)} = \mu_y$, i.e., $x^{(1)} = \mu_1$ if $y = 1$ and $x^{(1)} = \mu_{-1}$ if $y = -1$. For simplicity, we assume $\|\mu_1\| = \|\mu_{-1}\| = \|\mu\|$.

This multi-patch data model reflects the structure of image data, where each image consists of multiple patches, and only a subset of the patches are relevant to the class label, while the rest contribute as background noise. This data model has been employed in several existing studies (