NeurIPS2020

Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability

Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau

24 citations

Abstract

In this paper we revisit some classic problems on classification under misspecification. In particular, we study the problem of learning halfspaces where we are given labeled examples (X, Y ), where X is distributed arbitrarily and the labels Y are corrupted with Massart noise with rate η. In a recent work, Diakonikolas, Goulekakis, and Tzamos [DGT19] resolved a longstanding problem by giving the first efficient algorithm for learning to accuracy η + ϵ for any ϵ > 0. However, their algorithm outputs a complicated hypothesis, which partitions space into a polynomial number of regions growing with poly(d, 1/ϵ). Here we give a much simpler algorithm and in the process resolve a number of outstanding open questions: (1) We give the first proper learning algorithm for Massart halfspaces that achieves error η + ϵ for any ϵ > 0. We also give improved bounds on the sample complexity achievable by polynomial time algorithms. (2) Based on (1), we develop a blackbox knowledge distillation procedure which converts any classifier, possibly improper and arbitrarily complex, to a proper halfspace with equally good prediction accuracy. (3) We show the first lower bounds for achieving optimal accuracy. In particular, we show a superpolynomial statistical query lower bound for achieving OPT + ϵ error where OPT is the misclassification error of the best halfspace. Our lower bound is based on a simple but previously overlooked connection to the notion of evolvability.