ICLR2026

Consistent Low-Rank Approximation

David Woodruff, Samson Zhou

62 citations

Abstract

We introduce and study the problem of consistent low-rank approximation, in which rows of an input matrix ARn×d\mathbf{A}\in\mathbb{R}^{n\times d} arrive sequentially and the goal is to provide a sequence of subspaces that well-approximate the optimal rank-kk approximation to the submatrix A(t)\mathbf{A}^{(t)} that has arrived at each time tt, while minimizing the recourse, i.e., the overall change in the sequence of solutions. We first show that when the goal is to achieve a low-rank cost within an additive εA(t)F2\varepsilon\cdot||\mathbf{A}^{(t)}||_F^2 factor of the optimal cost, roughly O(kεlog(nd))\mathcal{O}\left(\frac{k}{\varepsilon}\log(nd)\right) recourse is feasible. For the more challenging goal of achieving a relative (1+ε)(1+\varepsilon)-multiplicative approximation of the optimal rank-kk cost, we show that a simple upper bound in this setting is k2ε2polylog(nd)\frac{k^2}{\varepsilon^2}\cdot\text{poly}\log(nd) recourse, which we further improve to k3/2ε2polylog(nd)\frac{k^{3/2}}{\varepsilon^2}\cdot\text{poly}\log(nd) for integer-bounded matrices and kε2polylog(nd)\frac{k}{\varepsilon^2}\cdot\text{poly}\log(nd) for data streams with polynomial online condition number. We also show that Ω(kεlognk)\Omega\left(\frac{k}{\varepsilon}\log\frac{n}{k}\right) recourse is necessary for any algorithm that maintains a multiplicative (1+ε)(1+\varepsilon)-approximation to the optimal low-rank cost, even if the full input is known in advance. Finally, we perform a number of empirical evaluations to complement our theoretical guarantees, demonstrating the efficacy of our algorithms in practice.