ICLR2022

Fast Regression for Structured Inputs

Raphael A. Meyer, Cameron Musco, Christopher Musco, David P. Woodruff, Samson Zhou

被引用 14 次

摘要

We study the p\ell_p regression problem, which requires finding xRd\mathbf{x}\in\mathbb R^{d} that minimizes Axbp\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_p for a matrix ARn×d\mathbf{A}\in\mathbb R^{n \times d} and response vector bRn\mathbf{b}\in\mathbb R^{n}. There has been recent interest in developing subsampling methods for this problem that can outperform standard techniques when nn is very large. However, all known subsampling approaches have run time that depends exponentially on pp, typically, dO(p)d^{\mathcal{O}(p)}, which can be prohibitively expensive. We improve on this work by showing that for a large class of common structured matrices, such as combinations of low-rank matrices, sparse matrices, and Vandermonde matrices, there are subsampling based methods for p\ell_p regression that depend polynomially on pp. For example, we give an algorithm for p\ell_p regression on Vandermonde matrices that runs in time O(nlog3n+(dp2)0.5+ωpolylogn)\mathcal{O}(n\log^3 n+(dp^2)^{0.5+\omega}\cdot\text{polylog}\,n), where ω\omega is the exponent of matrix multiplication. The polynomial dependence on pp crucially allows our algorithms to extend naturally to efficient algorithms for \ell_\infty regression, via approximation of \ell_\infty by O(logn)\ell_{\mathcal{O}(\log n)}. Of practical interest, we also develop a new subsampling algorithm for p\ell_p regression for arbitrary matrices, which is simpler than previous approaches for p4p \ge 4.