ICML 2022
Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification
Leo Schwinn, Leon Bungert, An Nguyen, René Raab, Falk Pulsmeyer, Doina Precup, Bjoern M. Eskofier, Dario Zanca
Cited 19 times
Abstract
The reliability of neural networks is essential for their use in safety-critical applications. Existing approaches generally aim at improving the robustness of neural networks to either real-world distribution shifts (e.g., common corruptions and perturbations, spatial transformations, and natural adversarial examples) or worst-case distribution shifts (e.g., optimized adversarial examples). In this work, we propose the Decision Region Quantification (DRQ) algorithm to improve the robustness of any differentiable pre-trained model against both real-world and worst-case distribution shifts in the data. DRQ analyzes the robustness of local decision regions in the vicinity of a given data point to make more reliable predictions. We theoretically motivate the DRQ algorithm by showing that it effectively smooths spurious local extrema in the decision surface. Furthermore, we propose an implementation using targeted and untargeted adversarial attacks. An extensive empirical evaluation shows that DRQ increases the robustness of adversarially and non-adversarially trained models against real-world and worst-case distribution shifts on several computer vision benchmark datasets.

* Work was partially done during an internship at the Mila - Quebec AI Institute.

Distribution shifts of the data can be separated into multiple categories, such as out-of-distribution data (Hendrycks & Gimpel, 2017), common corruptions (Hendrycks et al., 2021a), and adversarial examples (Madry et al., 2018). Making deep learning models robust against distribution shifts in the data is a long-standing problem (Quiñonero-Candela et al., 2008). One area of research focuses on improving the robustness of models at training time. Here, models are trained with specific procedures (e.g., data augmentations or adversarial training), which cannot be applied to pre-trained models (Madry et al., 2018; Yin et al., 2019; Geirhos et al., 2019; Hendrycks et al., 2020).
These methods usually entail vast computational overhead and require access to the training data (Madry et al., 2018; Rebuffi et al., 2021). Another set of methods aims to improve robustness at test time. However, these approaches are usually semi-supervised, rely on assumptions about the distribution of the test data, and often require additional model training at test time (Li et al., 2017; Shorten & Khoshgoftaar, 2019; Zhang et al., 2021). Moreover, most prior work focuses on robustness against either real-world distribution shifts (e.g., common corruptions and perturbations, spatial transformations, and natural adversarial examples) or worst-case distribution shifts (e.g., optimized adversarial examples).

In this work, we propose the Decision Region Quantification (DRQ) algorithm, which analyzes the decision surface of a given model to improve its predictions. Unlike previous work, which has mainly specialized in one threat model at a time, DRQ simultaneously improves the robustness of models to both real-world and worst-case distribution shifts. The algorithm is illustrated in Figure 1. Remarkably, the proposed approach does not require any further training data and can be directly combined with pre-trained models at test time. Additionally, no batch statistics are used at test time, and a single test sample is sufficient for DRQ-based inference.

Our contributions can be summarized as follows: First, we theoretically motivate the proposed DRQ algorithm by demonstrating its ability to smooth small spurious local extrema in the decision surface. Additionally, we provide an implementation using targeted and untargeted adversarial