CVPR 2025

Doppelgangers and Adversarial Vulnerability

George I. Kamberov

Abstract

Machine learning (ML) classifiers can make mistakes that are perceptually and cognitively disturbing to humans. The most notorious examples of such errors are adversarial visual metamers. This paper investigates the phenomenon of adversarial Doppelgängers (ADs), which encompasses adversarial visual metamers, and compares the performance and robustness of ML classifiers to human performance. We find that ADs are inputs that are close to each other with respect to a perceptual metric defined in this paper, and we show that ADs are qualitatively different from the usual adversarial examples. The vast majority of classifiers are vulnerable to ADs, and robustness-accuracy trade-offs may not improve their AD robustness. Some classification problems do not admit any AD-robust classifiers because the underlying classes are ambiguous. We provide criteria to determine whether a classification problem is well defined; describe the structure and attributes of AD-robust classifiers; introduce and explore the notions of conceptual entropy and regions of conceptual ambiguity for classifiers that are vulnerable to AD attacks; and discuss methods to bound the AD fooling rate of an attack. We define hypersensitive classifiers, that is, classifiers whose only mistakes are adversarial Doppelgängers. For such classifiers, improving AD robustness is equivalent to improving accuracy. We identify conditions guaranteeing that all classifiers with sufficiently high accuracy are hypersensitive.