ICML2023

Better Diffusion Models Further Improve Adversarial Training

Zekai Wang, Tianyu Pang, Chao Du, Min Lin, Weiwei Liu, Shuicheng Yan

被引用 300 次

摘要

It has been recognized that the data generated by the denoising diffusion probabilistic model (DDPM) improves adversarial training. After two years of rapid development in diffusion models, a question naturally arises: can better diffusion models further improve adversarial training? This paper gives an affirmative answer by employing the most recent diffusion model which has higher efficiency ( $\sim 20$ sampling steps) and image quality (lower FID score) compared with DDPM. Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data (no external datasets). Under the $\ell_\infty$ -norm threat model with $\epsilon=8/255$ , our models achieve $70.69\%$ and $42.67\%$ robust accuracy on CIFAR-10 and CIFAR-100, respectively, i.e. improving upon previous state-of-the-art models by $+4.58\%$ and $+8.03\%$ . Under the $\ell_2$ -norm threat model with $\epsilon=128/255$ , our models achieve $84.86\%$ on CIFAR-10 ( $+4.44\%$ ). These results also beat previous works that use external data. We also provide compelling results on the SVHN and TinyImageNet datasets. Our code is available at https://github.com/wzekai99/DM-Improves-AT.