ICML2021

A Scalable Second Order Method for Ill-Conditioned Matrix Completion from Few Samples

Christian Kümmerle, Claudio Mayrink Verdun

25 citations

Abstract

A . We propose an iterative algorithm for low-rank matrix completion that can be interpreted as an iteratively reweighted least squares (IRLS) algorithm, a saddle-escaping smoothing Newton method or a variable metric proximal gradient method applied to a non-convex rank surrogate. It combines the favorable data-efficiency of previous IRLS approaches with an improved scalability by several orders of magnitude. We establish the first local convergence guarantee from a minimal number of samples for that class of algorithms, showing that the method attains a local quadratic convergence rate. Furthermore, we show that the linear systems to be solved are well-conditioned even for very ill-conditioned ground truth matrices. We provide extensive experiments, indicating that unlike many state-of-the-art approaches, our method is able to complete very ill-conditioned matrices with a condition number of up to 10 10 from few samples, while being competitive in its scalability. . I In different areas of machine learning and signal processing, low-rank models have turned out to be a powerful tool for the acquisition, storage and computation of information. In many of these applications, an important sub-problem is to infer the low-rank model from partial or incomplete data [DR , CLC ]. This problem is called low-rank matrix completion: Given a matrix X 0 ∈ ℝ 𝑑 1 ×𝑑 2 of rank-𝑟 and an index set Ω ⊂ [𝑑 1 ] × [𝑑 2 ], the task is to reconstruct X 0 just from the knowledge of Ω and 𝑃 Ω (X 0 ), where 𝑃 Ω : ℝ 𝑑 1 ×𝑑 2 → ℝ 𝑚 is the subsampling operator that maps a matrix to the set of entries indexed by Ω. It is well-known that this can be reformulated [RFP ] as the NP-hard rank minimization problem ( ) min From an optimization point of view, ( ) is particularly difficult to handle due to two properties: its non-convexity and its non-smoothness. A widely studied approach in the literature replaces the rank(X) by the (convex) nuclear norm X * = 𝑑 𝑖=1 𝜎 𝑖 (X) [FHB ], which is the tightest convex envelope of the rank, as an objective. For this approach, a mature theory has been developed that includes performance guarantees for a near-optimal sample complexity [CT , Che ] and robustness to noise [CP , CCF + ]. However, from a practical point of view, using such a convex relaxation to find a low-rank completion is computationally very demanding, as even first-order solvers have an per-iteration arithmetic complexity that is at least cubic in the dimensions of X 0 [CLC ]. Thus, convex relaxations are of little use in large-scale applications of the model such as in recommender systems [KBV ], where even storing the dense matrix X 0 ∈ ℝ 𝑑 1 ×𝑑 2 is prohibitive. Another