ISSTA2023
DeepAtash: Focused Test Generation for Deep Learning Systems
Tahereh Zohdinasab, Vincenzo Riccio, Paolo Tonella
被引用 17 次
摘要
When deployed in the operation environment, Deep Learning (DL) systems often experience the so-called development to operation (dev2op) data shift, which causes a lower prediction accuracy on field data as compared to the one measured on the test set during development. To address the dev2op shift, developers must obtain new data with the newly observed features, as these are under-represented in the train/test set, and must use them to fine tune the DL model, so as to reach the desired accuracy level. In this paper, we address the issue of acquiring new data with the specific features observed in operation, which caused a dev2op shift, by proposing DeepAtash, a novel search-based focused testing approach for DL systems. DeepAtash targets a cell in the feature space, defined as a combination of feature ranges, to generate misbehaviour-inducing inputs with predefined features. Experimental results show that DeepAtash was able to generate up to 29X more targeted, failure-inducing inputs than the baseline approach. The inputs generated by DeepAtash were useful to significantly improve the quality of the original DL systems through fine tuning not only on data with the targeted features, but quite surprisingly also on inputs drawn from the original distribution.