NeurIPS 2025
Towards Irreversible Attack: Fooling Scene Text Recognition via Multi-Population Coevolution Search
Jingyu Li, Pengwen Dai, Mingqing Zhu, Chengwei Wang, Haolong Liu, Xiaochun Cao
Abstract
Recent work has shown that scene text recognition (STR) models are vulnerable to adversarial examples. Unlike non-sequential vision tasks, STR produces an output sequence that carries rich information. However, existing adversarial attacks against STR models alter only a few characters in the predicted text. The resulting predictions still retain partial information about the original text and can be easily corrected by an external dictionary or a language model. We therefore propose the Multi-Population Coevolution Search (MPCS) method, which attacks every character in the image. We first decompose the global optimization objective into sub-objectives to address the attack pixel concentration problem of previous attack methods. Because this distributed optimization paradigm introduces a new joint perturbation shift problem, we further propose a novel coevolution energy function to solve it. Experiments on recent STR models demonstrate the superiority of our method. The code is available at https://github.com/Lee-Jingyu/MPCS.
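To make the idea of decomposing a global objective into sub-objectives more concrete, the following is a minimal sketch of generic cooperative coevolution, in which one population is evolved per sub-objective and each candidate is scored together with the current representatives of the other populations. It is not the authors' MPCS implementation and does not reproduce the coevolution energy function; the function name `cooperative_coevolution_sketch`, the toy objectives, and all parameters (`eps`, `pop_size`, `generations`) are hypothetical choices made only for illustration.

```python
import random


def cooperative_coevolution_sketch(sub_objectives, dim, eps=0.1,
                                   pop_size=8, generations=30, seed=0):
    """Toy cooperative coevolution (assumed setup, not the paper's method).

    sub_objectives: list of callables f_i(joint) -> float, where `joint` is a
    list holding one perturbation vector per population. Higher is better.
    """
    rng = random.Random(seed)
    k = len(sub_objectives)

    def rand_vec():
        return [rng.uniform(-eps, eps) for _ in range(dim)]

    def mutate(vec):
        # Gaussian mutation, clipped to the perturbation budget [-eps, eps].
        return [max(-eps, min(eps, x + rng.gauss(0.0, eps / 4))) for x in vec]

    pops = [[rand_vec() for _ in range(pop_size)] for _ in range(k)]
    best = [pop[0] for pop in pops]  # current representative of each population

    def fitness(i, cand):
        # Score a candidate of population i jointly with the other representatives.
        joint = best[:i] + [cand] + best[i + 1:]
        return sub_objectives[i](joint)

    for _ in range(generations):
        for i in range(k):
            children = [mutate(p) for p in pops[i]]
            ranked = sorted(pops[i] + children,
                            key=lambda c: fitness(i, c), reverse=True)
            pops[i] = ranked[:pop_size]   # elitist survivor selection
            best[i] = ranked[0]
    return best


def make_objective(i, target):
    # Hypothetical sub-objective: pull region i toward `target`, with a small
    # coupling term that penalizes drifting away from the other regions.
    def f(joint):
        own = -sum((x - target) ** 2 for x in joint[i])
        mean_i = sum(joint[i]) / len(joint[i])
        coupling = sum((sum(v) / len(v) - mean_i) ** 2
                       for j, v in enumerate(joint) if j != i)
        return own - 0.1 * coupling
    return f


targets = [0.05, -0.05, 0.08]
objectives = [make_objective(i, t) for i, t in enumerate(targets)]
result = cooperative_coevolution_sketch(objectives, dim=4)
```

In this toy setting each population stands in for the perturbation of one character region, and the coupling term only hints at why independently optimized perturbations may need to be coordinated; how MPCS actually handles this is defined in the paper and the linked repository.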