ASE 2022

An Empirical Study on Numerical Bugs in Deep Learning Programs

Gan Wang, Zan Wang, Junjie Chen, Xiang Chen, Ming Yan

Cited by 15

Abstract

The task of a deep learning (DL) program is to train a model with high precision and apply it to different scenarios. A DL program often involves massive numerical calculations, so the robustness and stability of those calculations largely determine the quality of the program. Indeed, numerical bugs are common in DL programs and produce NaN (Not-a-Number) and INF (infinity) values. A numerical bug may render a DL model inaccurate and make the DL application unusable. In this work, we conduct the first empirical study on numerical bugs in DL programs by analyzing programs implemented on top of two popular DL libraries (i.e., TensorFlow and PyTorch). Specifically, we collect a dataset of 400 numerical bugs in DL programs. Then, we classify these numerical bugs into nine categories based on their root causes and summarize two findings. Finally, we provide the implications of our study for detecting numerical bugs in DL programs.

CCS CONCEPTS

• Software and its engineering → Software testing and debugging; Empirical software validation; • Computing methodologies → Artificial intelligence.
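To make the NaN and INF values mentioned in the abstract concrete, the following is a minimal sketch of one well-known way such values can arise: a softmax that exponentiates large logits directly. It is an illustration only, not an example drawn from the study's dataset or its nine categories. The helper names (`ieee_exp`, `naive_softmax`, `stable_softmax`, `has_non_finite`) are hypothetical, and plain Python is used instead of TensorFlow or PyTorch so the snippet runs without dependencies. Because Python's `math.exp` raises on overflow, `ieee_exp` is an assumed stand-in that returns `inf` instead, which mimics how floating-point array libraries typically behave.

```python
import math


def ieee_exp(x):
    """Hypothetical helper: exp(x) that overflows to +inf instead of raising."""
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf


def naive_softmax(logits):
    """Softmax without any safeguard: large logits overflow to inf."""
    exps = [ieee_exp(v) for v in logits]
    total = sum(exps)
    # inf / inf evaluates to NaN, so the overflow silently becomes NaN here.
    return [e / total for e in exps]


def stable_softmax(logits):
    """Softmax with the usual max-subtraction trick to avoid overflow."""
    shift = max(logits)
    exps = [ieee_exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def has_non_finite(values):
    """Return True if any value is NaN or infinite."""
    return any(math.isnan(v) or math.isinf(v) for v in values)


logits = [1000.0, 1000.0]
naive = naive_softmax(logits)
stable = stable_softmax(logits)

print("naive softmax contains NaN/INF:", has_non_finite(naive))
print("stable softmax contains NaN/INF:", has_non_finite(stable))
```

Both functions compute the same quantity mathematically. Only the order of operations differs, which is why a bug of this kind can be hard to spot by reading the code alone.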