NeurIPS2024
Globally Q-linear Gauss-Newton Method for Overparameterized Non-convex Matrix Sensing
Xixi Jia, Fangchen Feng, Deyu Meng, Defeng Sun
Abstract
This paper focuses on the optimization of overparameterized, non-convex low-rank matrix sensing (LRMS)—an essential component in contemporary statistics and machine learning. Recent years have witnessed significant breakthroughs in first-order methods, such as gradient descent, for tackling this non-convex optimization problem. However, the presence of numerous saddle points often prolongs the time required for gradient descent to overcome these obstacles. Moreover, overparame-terization can markedly decelerate gradient descent methods, transitioning its convergence rate from linear to sub-linear. In this paper, we introduce an approximated Gauss-Newton (AGN) method for tackling the non-convex LRMS problem. No-tably, AGN incurs a computational cost comparable to gradient descent per iteration but converges much faster without being slowed down by saddle points. We prove that, despite the non-convexity of the objective function, AGN achieves Q-linear convergence from random initialization to the global optimal solution. The global Q-linear convergence of AGN represents a substantial enhancement over the convergence of the existing methods for the overparameterized non-convex LRMS. The code for this paper is available at https://github.com/hsijiaxidian/AGN .