NeurIPS 2022

List-Decodable Sparse Mean Estimation

Shiwei Zeng, Jie Shen

Abstract

Robust mean estimation is one of the most important problems in statistics: given a set of samples in R^d where an α fraction are drawn from some distribution D and the rest are adversarially corrupted, we aim to estimate the mean of D. A surge of recent research interest has focused on the list-decodable setting where α ∈ (0, 1/2], and the goal is to output a finite number of estimates among which at least one approximates the target mean. In this paper, we consider the case where the underlying distribution D is Gaussian with a k-sparse mean. Our main contribution is the first polynomial-time algorithm that enjoys sample complexity O(poly(k, log d)), i.e., poly-logarithmic in the dimension. One of our core algorithmic ingredients is the use of low-degree sparse polynomials to filter outliers, which may find more applications.

Learning with overwhelming corruption (α ≤ 1/2). The agnostic label noise of [Hau92, KSS92] seems the earliest model that allows the adversary to arbitrarily corrupt any fraction of