NeurIPS 2021

Towards a Unified Game-Theoretic View of Adversarial Perturbations and Robustness

Jie Ren, Die Zhang, Yisen Wang, Lu Chen, Zhanpeng Zhou, Yiting Chen, Xu Cheng, Xin Wang, Meng Zhou, Jie Shi, Quanshi Zhang

Cited 26 times

Abstract

How can adversarial robustness be explained from the perspective of feature representation?
• We discover that adversarial attacks mainly affect high-order interactions between input variables.
• Adversarial training boosts the robustness of DNNs by learning more discriminative low-order interactions.
• We propose a unified explanation for several adversarial defense methods.