NeurIPS2021

Can we have it all? On the Trade-off between Spatial and Adversarial Robustness of Neural Networks

Sandesh Kamath, Amit Deshpande, Subrahmanyam Kambhampati Venkata, Vineeth N. Balasubramanian

13 citations

Abstract

Non-)robustness of neural networks to small, adversarial pixel-wise perturbations, and as more recently shown, to even random spatial transformations (e.g., translations, rotations) entreats both theoretical and empirical understanding. Spatial robustness to random translations and rotations is commonly attained via equivariant models (e.g., StdCNNs, GCNNs) and training augmentation, whereas adversarial robustness is typically achieved by adversarial training. In this paper, we prove a quantitative trade-off between spatial and adversarial robustness in a simple statistical setting. We complement this empirically by showing that: (a) as the spatial robustness of equivariant models improves by training augmentation with progressively larger transformations, their adversarial robustness worsens progressively, and (b) as the state-of-the-art robust models are adversarially trained with progressively larger pixel-wise perturbations, their spatial robustness drops progressively. Towards achieving Pareto-optimality in this trade-off, we propose a method based on curriculum learning that trains gradually on more difficult perturbations (both spatial and adversarial) to improve spatial and adversarial robustness simultaneously. Spatial-Adversarial Robustness Trade-off In this section, we prove the trade-off between spatial and adversarial robustness theoretically, and support this result with experiments in Sec 4. We use A(x) to denote an adversarial ∞ perturbation and r(x) to denote a random spatial transformation. Equivariant model constructions often consider a group of transformations and construct a neural network model invariant to this group. For simplicity, we consider a cyclic group that can model a rotation group (e.g., integer multiples of 30 • ), or horizontal/vertical translations (e.g., horizontal translations by multiples of, say, ±4 pixels).