ICLR2025

Streaming Algorithms For ℓp Flows and ℓp Regression

Amit Chakrabarti, Jeffrey Jiang, David P. Woodruff, Taisuke Yasuda

Abstract

We initiate the study of one-pass streaming algorithms for underdetermined ℓ p linear regression problems of the form min where 𝐴 ∈ R n×d with n ≪ d , which generalizes basis pursuit (p = 1) and least squares solutions to underdetermined linear systems (p = 2). We study the column-arrival streaming model, in which the columns of 𝐴 and then the vector 𝑏 are presented one by one in a stream. When 𝐴 is the incidence matrix of a graph, this corresponds to an edge insertion graph stream, and the regression problem captures ℓ p flows which includes transshipment (p = 1), electrical flows (p = 2), and max flow (p = ∞) on undirected graphs as special cases. Our goal is to design algorithms which use space much less than the entire stream, which has a length of d. For the task of estimating the cost of the ℓ p regression problem for p ∈ [2, ∞], we show a streaming algorithm which constructs a sparse instance supported on Õ(ε -2 n) columns of 𝐴 which approximates the cost up to a (1 ± ε) factor, which corresponds to Õ(ε -2 n 2 ) bits of space in general and an Õ(ε -2 n) space semi-streaming algorithm for constructing ℓ p flow sparsifiers on graphs. This extends to p ∈ (1, 2) with Õ(ε 2 n q/2 ) columns, where q is the Hölder conjugate exponent of p. For p = 2, we show that Ω(n 2 ) bits of space are required in general even for outputting a constant factor solution. For p = 1, we show that the cost cannot be estimated even to an o( √ n) factor in poly(n) space. On the other hand, if we are interested in outputting a solution 𝑥, then we show that (1 + ε)-approximations require Ω(d) space for p > 1, and in general, β -approximations require Ω(d/β 2q ) space for p > 1. We complement these lower bounds with the first sublinear space upper bounds for this problem, showing that we can output a β -approximation using space only poly(n) • Õ(d/β q ) for p > 1, as well as a √ n-approximation using poly(n, log d) space for p = 1.