ISSTA2024
See the Forest, not Trees: Unveiling and Escaping the Pitfalls of Error-Triggering Inputs in Neural Network Testing
Yuanyuan Yuan, Shuai Wang, Zhendong Su
1 citation
Abstract
Recent efforts in deep neural network (DNN) testing commonly use error-triggering inputs (ETIs) to quantify DNN errors and to fine-tune the tested DNN for repairing. This study reveals the pitfalls of ETIs in DNN testing. Specifically, merely seeking for more ETIs “traps” the testing campaign into local plateaus, where similar ETIs are continuously generated using a few fixed input transformations. Similarly, fine-tuning the DNN with ETIs, while capable of fixing the exposed DNN mis-predictions, undermines the DNN’s resilience towards certain input transformations. However, these ETI-induced pitfalls have been overlooked in previous research, due to the insufficient input transformations (usually < 10), and we show that the severity of such deceptive phenomena is enlarged when testing DNNs with more and diverse real-life input transformations. This paper presents a comprehensive study on the pitfalls of ETIs in DNN testing. We first augment conventional DNN testing pipelines with a large set of input transformations; the correctness and validity of these new transformations are verified with large-scale human studies. Based on this, we show that launching an endless pursuit for ETIs cannot alleviate the “trapped testing” issue, and the undermined resilience pervasively occurs in many input transformations. Accordingly, we propose a novel and holistic viewpoint over DNN errors: instead of counting which input triggers a DNN mis-prediction, we record which input transformation can generate ETIs. The targeted input property of this transformation, termed erroneous property (EP), counts one DNN error and guides DNN testing (i.e., our new paradigm aims to find more EPs rather than ETIs). Evaluation shows that this EP-oriented testing paradigm significantly expands the explored DNN error space. Moreover, fine-tuning DNNs with EPs effectively improves their resilience towards different input transformations.