CVPR 2025

Enhancing Adversarial Transferability with Checkpoints of a Single Model's Training

Shixin Li, Chaoxiang He, Xiaojing Ma, Bin Benjamin Zhu, Shuo Wang, Hongsheng Hu, Dongmei Zhang, Linchen Yu

Abstract

Adversarial attacks threaten the integrity of deep neural networks (DNNs), particularly in high-stakes applications. In this paper, we present a novel black-box adversarial attack that leverages the checkpoints saved along a single model's training trajectory. Unlike conventional ensemble attacks, which require multiple surrogate models with diverse architectures, our approach exploits the intrinsic diversity across the training stages of a single surrogate model. We decompose the learned representations into task-intrinsic and task-irrelevant components and use an accuracy-gap-based selection strategy to identify checkpoints that predominantly capture transferable, task-intrinsic knowledge. Extensive experiments on ImageNet and CIFAR-10 demonstrate that our method consistently outperforms traditional ensemble attacks in transferability, even in resource-constrained, practical settings. This work offers a resource-efficient way to craft highly transferable adversarial examples and provides new insights into the dynamics of adversarial vulnerability.
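The two ingredients named in the abstract, selecting checkpoints by an accuracy gap and attacking with the selected checkpoints as an ensemble, can be illustrated with a minimal sketch. The abstract does not define the accuracy gap or the selection rule, so everything below is an assumption made for illustration: the gap is taken to be training accuracy minus held-out accuracy, the checkpoints with the smallest gap are kept, and the attack step is a plain FGSM-style sign step on the gradient averaged over the selected checkpoints. The names `CheckpointStats`, `select_checkpoints`, and `ensemble_sign_step` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CheckpointStats:
    """Hypothetical per-checkpoint record; field names are illustrative."""
    epoch: int
    train_acc: float
    val_acc: float


def accuracy_gap(c: CheckpointStats) -> float:
    # ASSUMPTION: gap = training accuracy minus held-out accuracy.
    return c.train_acc - c.val_acc


def select_checkpoints(stats: Sequence[CheckpointStats], k: int) -> List[int]:
    # ASSUMPTION: keep the k checkpoints with the smallest gap,
    # and return their epochs in training order.
    ranked = sorted(stats, key=accuracy_gap)
    return sorted(c.epoch for c in ranked[:k])


def ensemble_sign_step(x: Sequence[float],
                       grads: Sequence[Sequence[float]],
                       step: float) -> List[float]:
    # One FGSM-style step: average the loss gradients computed on each
    # selected checkpoint, then move every input coordinate by `step`
    # in the direction of the sign of that mean gradient.
    n = len(grads)
    mean = [sum(g[i] for g in grads) / n for i in range(len(x))]
    return [xi + step * ((m > 0) - (m < 0)) for xi, m in zip(x, mean)]


if __name__ == "__main__":
    stats = [
        CheckpointStats(epoch=10, train_acc=0.60, val_acc=0.58),
        CheckpointStats(epoch=30, train_acc=0.80, val_acc=0.76),
        CheckpointStats(epoch=50, train_acc=0.90, val_acc=0.75),
    ]
    print(select_checkpoints(stats, k=2))
    # Toy gradients from two selected checkpoints for a 2-dimensional input.
    print(ensemble_sign_step([0.5, 0.5], [[1.0, -2.0], [3.0, -1.0]], step=0.1))
```

In a real attack the gradients would come from backpropagating a classification loss through each selected checkpoint, and the perturbed input would be projected back into the allowed perturbation budget after each step; both are omitted here to keep the sketch dependency-free.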