STOC2024

Sparsifying Generalized Linear Models

Arun Jambulapati, James R. Lee, Yang P. Liu, Aaron Sidford

被引用 1 次

摘要

We consider the sparsification of sums F : ℝn → ℝ+ where F(x) = f1(⟨ a1,x⟩) + ⋯ + fm(⟨ am,x⟩) for vectors a1,…,am ∈ ℝn and functions f1,…,fm : ℝ → ℝ+. We show that (1+ε)-approximate sparsifiers of F with support size n/ε2 (logn/ε)O(1) exist whenever the functions f1,…,fm are symmetric, monotone, and satisfy natural growth bounds. Additionally, we give efficient algorithms to compute such a sparsifier assuming each fi can be evaluated efficiently. Our results generalize the classical case of ℓp sparsification, where fi(z) = |z|p, for p ∈ (0, 2], and give the first near-linear size sparsifiers in the well-studied setting of the Huber loss function and its generalizations, e.g., fi(z) = min|z|p, |z|2 for 0 < p ≤ 2. Our sparsification algorithm can be applied to give near-optimal reductions for optimizing a variety of generalized linear models including ℓp regression for p ∈ (1, 2] to high accuracy, via solving (logn)O(1) sparse regression instances with m ≤ n(logn)O(1), plus runtime proportional to the number of nonzero entries in the vectors a1, …, am.