ICML 2021

Understanding the Dynamics of Gradient Flow in Overparameterized Linear Models

Salma Tarmoun, Guilherme França, Benjamin D. Haeffele, René Vidal

Cited 76 times

Abstract

We provide a detailed analysis of the dynamics of the gradient flow in overparameterized two-layer linear models. A particularly interesting feature of this model is that its nonlinear dynamics can be exactly solved as a consequence of a large number of conservation laws that constrain the system to follow particular trajectories. More precisely, the gradient flow preserves the difference of the Gramian matrices of the input and output weights, and its convergence to equilibrium depends on both the magnitude of that difference (which is fixed at initialization) and the spectrum of the data. In addition, and generalizing prior work, we prove our results without assuming small, balanced or spectral initialization for the weights. Moreover, we establish interesting mathematical connections between matrix factorization problems and differential equations of the Riccati type.
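The conservation law mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the authors' code, under several assumptions that do not come from the abstract: the model is written as W2 W1 x with a mean squared-error loss, the conserved quantity is taken in the form W1 W1^T - W2^T W2, and gradient flow is approximated by gradient descent with a small step size, so the difference is preserved only up to discretization error. The dimensions, the seed, and the names `loss_and_grads`, `imbalance`, and `simulate` are illustrative choices.

```python
import numpy as np


def loss_and_grads(W1, W2, X, Y):
    """Mean squared-error loss of the two-layer linear model and its gradients."""
    n = X.shape[1]
    E = W2 @ W1 @ X - Y                 # residual, shape (m, n)
    loss = 0.5 * np.sum(E ** 2) / n
    G1 = W2.T @ E @ X.T / n             # gradient w.r.t. input weights W1, shape (h, d)
    G2 = E @ X.T @ W1.T / n             # gradient w.r.t. output weights W2, shape (m, h)
    return loss, G1, G2


def imbalance(W1, W2):
    """Difference of the Gramians of the input and output weights (h x h)."""
    return W1 @ W1.T - W2.T @ W2


def simulate(step=1e-3, n_steps=5000, seed=0):
    """Run small-step gradient descent as a stand-in for gradient flow."""
    rng = np.random.default_rng(seed)
    d, h, m, n = 3, 5, 2, 20            # input dim, hidden width, output dim, samples
    X = rng.standard_normal((d, n))
    Y = rng.standard_normal((m, d)) @ X  # targets generated by a linear map
    # Generic initialization: neither small, nor balanced, nor spectral.
    W1 = 0.5 * rng.standard_normal((h, d))
    W2 = 0.5 * rng.standard_normal((m, h))

    D0 = imbalance(W1, W2)
    initial_loss, _, _ = loss_and_grads(W1, W2, X, Y)
    for _ in range(n_steps):
        _, G1, G2 = loss_and_grads(W1, W2, X, Y)
        W1, W2 = W1 - step * G1, W2 - step * G2
    final_loss, _, _ = loss_and_grads(W1, W2, X, Y)

    # Frobenius norm of the change in the Gramian difference over the whole run.
    drift = np.linalg.norm(imbalance(W1, W2) - D0)
    return drift, initial_loss, final_loss


drift, initial_loss, final_loss = simulate()
print(f"loss: {initial_loss:.4f} -> {final_loss:.6f}")
print(f"drift of Gramian difference: {drift:.2e}")
```

In this sketch the loss decreases while the Gramian difference barely moves. For the explicit Euler update used here, each step changes that difference only at second order in the step size, so reducing `step` should shrink the reported drift; in the continuous-time limit it vanishes, which is the conservation law.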