ICML 2021

Provable Robustness of Adversarial Training for Learning Halfspaces with Noise

Difan Zou, Spencer Frei, Quanquan Gu

Cited by 15

Abstract

We analyze the properties of adversarial training for learning adversarially robust halfspaces in the presence of agnostic label noise. Denoting $\mathsf{OPT}_{p,r}$ as the best robust classification error achieved by a halfspace that is robust to perturbations of $\ell_{p}$ balls of radius $r$, we show that adversarial training on the standard binary cross-entropy loss yields adversarially robust halfspaces up to (robust) classification error $\tilde O(\sqrt{\mathsf{OPT}_{2,r}})$ for $p=2$, and $\tilde O(d^{1/4} \sqrt{\mathsf{OPT}_{\infty, r}} + d^{1/2} \mathsf{OPT}_{\infty,r})$ when $p=\infty$. Our results hold for distributions satisfying anti-concentration properties enjoyed by log-concave isotropic distributions among others. We additionally show that if one instead uses a nonconvex sigmoidal loss, adversarial training yields halfspaces with an improved robust classification error of $O(\mathsf{OPT}_{2,r})$ for $p=2$, and $O(d^{1/4}\mathsf{OPT}_{\infty, r})$ when $p=\infty$. To the best of our knowledge, this is the first work to show that adversarial training provably yields robust classifiers in the presence of noise.
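As a rough illustration of the setting, adversarial training of a halfspace on the binary cross-entropy (logistic) loss might be sketched as below. For a linear classifier the inner maximization over the $\ell_p$ ball has a closed form: the worst-case margin is $y\langle w, x\rangle - r\|w\|_q$, where $\ell_q$ is the dual norm of $\ell_p$. The function names, the zero initialization, the step size, and the use of plain (unprojected) gradient descent are assumptions made for this sketch; the abstract does not specify the authors' exact procedure.

```python
import numpy as np


def worst_case_inputs(w, X, y, r, p):
    """Move each x by the l_p perturbation of radius r that most reduces y * <w, x>."""
    if p == 2:
        g = w / max(np.linalg.norm(w), 1e-12)  # direction of w (zero if w is zero)
    elif p == np.inf:
        g = np.sign(w)                          # each coordinate moves by r against w
    else:
        raise ValueError("only p=2 and p=inf are sketched here")
    return X - r * y[:, None] * g[None, :]


def robust_margins(w, X, y, r, p):
    """Closed form: y * <w, x> - r * ||w||_q, with q the dual exponent of p."""
    q = 2 if p == 2 else 1
    return y * (X @ w) - r * np.linalg.norm(w, q)


def robust_error(w, X, y, r, p):
    """Fraction of points that some perturbation in the l_p ball misclassifies."""
    return float(np.mean(robust_margins(w, X, y, r, p) <= 0))


def adversarial_train(X, y, r, p=2, lr=0.1, steps=500):
    """Gradient descent on the logistic loss evaluated at worst-case inputs.

    Hypothetical training loop for illustration only (assumed hyperparameters).
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        X_adv = worst_case_inputs(w, X, y, r, p)   # inner maximization
        m = y * (X_adv @ w)                        # margins on perturbed inputs
        dloss_dm = -np.exp(-np.logaddexp(0.0, m))  # derivative of log(1 + exp(-m))
        grad = (dloss_dm * y) @ X_adv / n          # gradient at the maximizer
        w = w - lr * grad                          # outer minimization step
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((2000, 5))
    y = np.where(X[:, 0] >= 0, 1.0, -1.0)          # labels from a halfspace
    flip = rng.random(2000) < 0.05                 # 5% random label flips (toy noise)
    y[flip] *= -1
    w_hat = adversarial_train(X, y, r=0.1, p=2)
    print("robust error:", robust_error(w_hat, X, y, 0.1, 2))
```

The toy data here uses random label flips on Gaussian inputs, which is only a special case of the agnostic noise model in the abstract; the sketch is meant to show the mechanics of the min-max update, not to reproduce the stated guarantees.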