NeurIPS 2020

Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free

Haotao Wang, Tianlong Chen, Shupeng Gui, Ting-Kuei Hu, Ji Liu, Zhangyang Wang

91 citations

Abstract

Adversarial training and its many variants substantially improve deep network robustness, yet at the cost of compromising standard accuracy. Moreover, the training process is computationally heavy, which makes it impractical to thoroughly explore the trade-off between accuracy and robustness. This paper asks a new question: how can a trained model be quickly calibrated in-situ, to examine the achievable trade-offs between its standard and robust accuracies, without (re-)training it many times? Our proposed framework, Once-for-all Adversarial Training (OAT), is built on an innovative model-conditional training framework, with a controlling hyper-parameter as the input. The trained model can be adjusted among different standard and robust accuracies "for free" at testing time. As an important knob, we exploit dual batch normalization to separate standard and adversarial feature statistics, so that they can be learned in one model without degrading performance. We further extend OAT to a Once-for-all Adversarial Training and Slimming (OATS) framework, which allows for a joint trade-off among accuracy, robustness and runtime efficiency. Experiments show that, without any re-training or ensembling, OAT/OATS achieve similar or even superior performance compared to dedicatedly trained models at various configurations. Our code and pretrained models are available at: https://github.com/VITA-Group/Once-for-All-Adversarial-Training

Motivation and background

Deep neural networks (DNNs) are now well known to be vulnerable to adversarial examples [1, 2]. With the growing use of DNNs in security-sensitive applications, such as self-driving [3] and biometrics [4], a critical concern has been raised to carefully examine the worst-case accuracy of deployed DNNs under crafted attacks (denoted as robust accuracy, or robustness for short, following [5]), in addition to their average accuracy on standard inputs (denoted as standard accuracy, or accuracy for short).
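To make the two metrics concrete, the following is a minimal sketch that computes standard accuracy and robust accuracy for a toy linear classifier. Everything in it is an illustrative assumption rather than the paper's setup: the weights, the four data points, the L-infinity budget `eps`, and the helper names are hypothetical. A linear model is used only because its worst-case L-infinity perturbation has a closed form; for deep networks the worst case is typically approximated by running an attack.

```python
def predict(w, b, x):
    """Toy linear classifier (assumed for illustration): returns +1 or -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def worst_case_input(w, x, y, eps):
    """Closed-form worst-case L-infinity perturbation for a linear model:
    shift each coordinate by eps in the direction that lowers the margin
    y * (w . x + b)."""
    def sign(v):
        return (v > 0) - (v < 0)
    return [xi - eps * y * sign(wi) for wi, xi in zip(w, x)]


def standard_accuracy(w, b, data):
    """Average accuracy on clean (unperturbed) inputs."""
    return sum(predict(w, b, x) == y for x, y in data) / len(data)


def robust_accuracy(w, b, data, eps):
    """Accuracy when every input is replaced by its worst-case perturbation."""
    hits = sum(predict(w, b, worst_case_input(w, x, y, eps)) == y
               for x, y in data)
    return hits / len(data)


# Hypothetical model and data: two points far from the boundary,
# two points close to it.
w, b = [1.0, -1.0], 0.0
data = [([2.0, 0.0], 1), ([0.3, 0.0], 1), ([0.0, 2.0], -1), ([0.0, 0.2], -1)]

std = standard_accuracy(w, b, data)
rob = robust_accuracy(w, b, data, eps=0.25)
print(f"standard accuracy: {std:.2f}, robust accuracy: {rob:.2f}")
```

In this toy setting the classifier is correct on all clean inputs, but the two points that lie close to the decision boundary are misclassified once perturbed, so the robust accuracy is lower than the standard accuracy.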
Among the many adversarial defense methods proposed to enhance DNN robustness, adversarial training (AT) based methods [5, 6, 7] are consistently top performers. While adversarial defense methods are gaining attention and popularity in safety/security-critical applications, their downsides are also noteworthy. First, most adversarial defense methods, including adversarial training, come at the price of compromising standard accuracy [8]. That

* The first two authors contributed equally.
34th Conference on Neural Information Processing Systems (NeurIPS 2020),