ICML 2022
Provably Adversarially Robust Nearest Prototype Classifiers
Václav Vorácek, Matthias Hein
Cited 15 times
Abstract
Nearest prototype classifiers (NPCs) assign to each input point the label of the nearest prototype with respect to a chosen distance metric. A direct advantage of NPCs is that their decisions are interpretable. Previous work could provide lower bounds on the minimal adversarial perturbation in the ℓ_p-threat model when using the same ℓ_p-distance for the NPCs. In this paper we provide a complete discussion of the complexity when using ℓ_p-distances for decision and ℓ_q-threat models for certification for p, q ∈ {1, 2, ∞}. In particular, we provide scalable algorithms for the exact computation of the minimal adversarial perturbation when using the ℓ_2-distance and improved lower bounds in the other cases. Using efficient improved lower bounds, we train our Provably adversarially robust NPC (PNPC) for MNIST, which has better ℓ_2-robustness guarantees than neural networks. Additionally, we show, to our knowledge, the first certification results w.r.t. the LPIPS perceptual metric, which has been argued to be a more realistic threat model for image classification than ℓ_p-balls. Our PNPC has higher certified robust accuracy on CIFAR10 than the empirical robust accuracy reported in (Laidlaw et al., 2021). The code is available in our repository.
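
To make the NPC decision rule concrete, the following is a minimal sketch, not the authors' implementation. It assumes the ℓ_2-distance for both the decision and the threat model, and it uses a standard geometric lower bound (distance to the bisecting hyperplanes between prototypes) rather than the exact or improved procedures of the paper. The function names `npc_predict` and `l2_certified_radius` and the toy data are hypothetical.

```python
import numpy as np


def npc_predict(x, prototypes, labels):
    """Return (label, index) of the prototype nearest to x in l2 distance."""
    dists = np.linalg.norm(prototypes - x, axis=1)
    nearest = int(np.argmin(dists))
    return labels[nearest], nearest


def l2_certified_radius(x, prototypes, labels):
    """Lower bound on the minimal l2 perturbation that changes the prediction.

    For x to be assigned to an other-class prototype w_j, it must become at
    least as close to w_j as to every same-class prototype w_i, i.e. it must
    reach the bisecting hyperplane of each pair (w_i, w_j). The distance from
    x to that hyperplane is
        (||x - w_j||^2 - ||x - w_i||^2) / (2 ||w_i - w_j||),
    so max over i, then min over j, is a valid lower bound.
    Assumes prototypes of different classes are distinct points.
    """
    dists_sq = np.sum((prototypes - x) ** 2, axis=1)
    pred = labels[int(np.argmin(dists_sq))]
    same = np.where(labels == pred)[0]
    other = np.where(labels != pred)[0]
    if other.size == 0:
        return float("inf")  # only one class present: prediction cannot change

    radius = float("inf")
    for j in other:
        gaps = dists_sq[j] - dists_sq[same]
        seps = 2.0 * np.linalg.norm(prototypes[same] - prototypes[j], axis=1)
        radius = min(radius, float(np.max(gaps / seps)))
    return radius


# Toy example (hypothetical data): two prototypes on a line.
prototypes = np.array([[0.0, 0.0], [4.0, 0.0]])
labels = np.array([0, 1])
x = np.array([1.0, 0.0])

label, idx = npc_predict(x, prototypes, labels)
radius = l2_certified_radius(x, prototypes, labels)
print(label, idx, radius)
```

In this toy case the decision boundary is the vertical line through (2, 0), so the point (1, 0) lies at distance 1 from it and the bound is tight. With more prototypes per class the bound is generally only a lower bound on the true minimal adversarial perturbation.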