ICML 2022

Provably Adversarially Robust Nearest Prototype Classifiers

Václav Vorácek, Matthias Hein

Cited by 15

Abstract

Nearest prototype classifiers (NPCs) assign to each input point the label of the nearest prototype with respect to a chosen distance metric. A direct advantage of NPCs is that the decisions are interpretable. Previous work could provide lower bounds on the minimal adversarial perturbation in the $\ell_p$-threat model when using the same $\ell_p$-distance for the NPCs. In this paper we provide a complete discussion of the complexity when using $\ell_p$-distances for decision and $\ell_q$-threat models for certification for $p, q \in \{1, 2, \infty\}$. In particular, we provide scalable algorithms for the exact computation of the minimal adversarial perturbation when using the $\ell_2$-distance, and improved lower bounds in other cases. Using efficient improved lower bounds, we train our Provably adversarially robust NPC (PNPC) for MNIST, which has better $\ell_2$-robustness guarantees than neural networks. Additionally, we show, to the best of our knowledge, the first certification results w.r.t. the LPIPS perceptual metric, which has been argued to be a more realistic threat model for image classification than $\ell_p$-balls. Our PNPC has higher certified robust accuracy on CIFAR10 than the empirical robust accuracy reported in (Laidlaw et al., 2021). The code is available in our repository.
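To make the setting concrete, the following is a minimal sketch of an NPC with the $\ell_2$-distance together with a simple lower bound on the minimal $\ell_2$ adversarial perturbation. It is not the exact algorithm or the improved bounds of the paper; it only uses the standard fact that, to beat a fixed same-class prototype $w_i$, a perturbation must cross at least one bisecting hyperplane between $w_i$ and some other-class prototype $w_j$, whose distance from $x$ is $(\|x - w_j\|_2^2 - \|x - w_i\|_2^2) / (2\|w_i - w_j\|_2)$. The function names `sq_dist`, `npc_predict`, and `l2_certified_radius` are hypothetical, and prototypes of different classes are assumed to be distinct points.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def npc_predict(x, prototypes, labels):
    """Return the label of the prototype nearest to x in l2-distance."""
    nearest = min(range(len(prototypes)), key=lambda k: sq_dist(x, prototypes[k]))
    return labels[nearest]


def l2_certified_radius(x, y, prototypes, labels):
    """Lower bound on the l2-norm of any perturbation that changes the label y.

    Illustrative bound only (an assumption of this sketch, not the paper's
    method): max over same-class prototypes w_i of the min over other-class
    prototypes w_j of the distance from x to the bisecting hyperplane of
    (w_i, w_j). Returns 0.0 if x is not classified as y.
    """
    same = [w for w, c in zip(prototypes, labels) if c == y]
    other = [w for w, c in zip(prototypes, labels) if c != y]
    if not same or not other:
        raise ValueError("need at least one prototype of class y and one of another class")

    best = 0.0
    for wi in same:
        # Signed distance from x to the nearest hyperplane separating wi
        # from any prototype of a different class.
        margin = min(
            (sq_dist(x, wj) - sq_dist(x, wi)) / (2.0 * math.sqrt(sq_dist(wi, wj)))
            for wj in other
        )
        best = max(best, margin)
    return best


# Toy usage: two prototypes on a line, decision boundary at the midpoint.
prototypes = [[0.0, 0.0], [2.0, 0.0]]
labels = [0, 1]
x = [0.5, 0.0]
pred = npc_predict(x, prototypes, labels)
radius = l2_certified_radius(x, pred, prototypes, labels)
```

With a single prototype per class, as in the toy usage, the bound coincides with the exact distance to the decision boundary. With several prototypes per class it is in general only a lower bound, which is the gap that exact computation closes.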