CCS2018

The Influence of Code Coverage Metrics on Automated Testing Efficiency in Android

Stanislav Dashevskyi, Olga Gadyatskaya, Aleksandr Pilgun, Yury Zhauniarovich


Abstract

For the 100 apps selected for the second experiment, we compute the number of faults detected jointly by the 3 metrics and the number of faults found by each individual metric in 3 runs.

Context

Automated testing and dynamic analysis techniques are critical for ensuring the reliability and security of third-party Android apps. One of the biggest challenges for these techniques is effective app exploration in the black-box setting: Android apps have many entry points, and their source code is unavailable for inspection. State-of-the-art tools use a wide variety of app exploration strategies, ranging from generating random GUI events to systematic exploration of app models [1], but there is no agreement on the success criteria. Code coverage is a common metric for evaluating the efficiency of automated testing and dynamic analysis tools [1], and some of these tools use code coverage as a component of a fitness function to guide app exploration and find more bugs [2]. Code coverage exists in many flavors, and there is currently no agreement in the community on which metrics to use in the fitness function. Are they all the same, or is there a code coverage granularity that works best? We take the first step towards reaching this agreement.

Hypothesis

Combining different granularities of code coverage can be beneficial for achieving better results in automated testing of Android apps.

Experiment setting

Sapienz [3] is a state-of-the-art bug-finding tool for Android apps. It relies on Monkey [3] to generate random input events and applies a genetic algorithm to event sequences. The test selection function combines code coverage, the number of found bugs, and the size of a test sequence. Sapienz is designed to utilize activity, method and statement coverage. We set out to evaluate how these metrics fare against each other in finding bugs.
Activity coverage was computed by Sapienz, and method and instruction coverage were measured with our own ACVTool (the tool is currently available at https://github.com/pilgun/acvtool).
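
To illustrate how a selection function might combine code coverage, the number of found bugs, and the size of a test sequence, the following is a minimal Python sketch based on Pareto dominance over these three objectives. The text above only states that the three quantities are combined; treating them as separate objectives (maximize coverage, maximize crashes, minimize length) is an assumption made here for illustration, and the `EventSequence` type and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EventSequence:
    """Hypothetical summary of one evolved event sequence (not a Sapienz type)."""
    name: str
    coverage: float  # fraction of code covered; higher is better
    crashes: int     # number of crashes observed; higher is better
    length: int      # number of events in the sequence; lower is better


def dominates(a: EventSequence, b: EventSequence) -> bool:
    """Return True if `a` is no worse than `b` on all objectives and strictly better on one."""
    no_worse = (a.coverage >= b.coverage
                and a.crashes >= b.crashes
                and a.length <= b.length)
    strictly_better = (a.coverage > b.coverage
                       or a.crashes > b.crashes
                       or a.length < b.length)
    return no_worse and strictly_better


def pareto_front(population: List[EventSequence]) -> List[EventSequence]:
    """Keep the sequences that no other sequence in the population dominates."""
    return [c for c in population
            if not any(dominates(other, c) for other in population)]


# Illustrative values only; these are not measurements from the experiments.
population = [
    EventSequence("s1", 0.40, 1, 300),
    EventSequence("s2", 0.40, 1, 500),  # same coverage and crashes as s1, but longer
    EventSequence("s3", 0.55, 0, 200),
    EventSequence("s4", 0.30, 0, 400),  # worse than s1 on every objective
]
front = pareto_front(population)
print([s.name for s in front])
```

In this sketch, swapping the coverage granularity (activity, method, or instruction) only changes how the `coverage` field is computed; the selection logic itself stays the same, which is what makes the metrics directly comparable.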
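
The fault comparison mentioned at the start of the abstract (faults detected jointly by the 3 metrics versus faults found by each individual metric in 3 runs) can be sketched with set operations. This is an illustrative sketch only: the crash identifiers are hypothetical, and since "jointly" could mean either the combined total or the overlap, both are computed.

```python
from typing import Dict, List, Set

# Hypothetical crash identifiers per coverage metric, one set per run (3 runs each).
# These are made-up placeholders, not results from the study.
runs: Dict[str, List[Set[str]]] = {
    "activity": [{"c1", "c2"}, {"c1"}, {"c2", "c3"}],
    "method": [{"c1", "c4"}, {"c4"}, {"c1", "c5"}],
    "instruction": [{"c1"}, {"c6"}, {"c1", "c4"}],
}


def merge_runs(run_sets: List[Set[str]]) -> Set[str]:
    """Union of the faults seen in any run for one metric."""
    return set().union(*run_sets)


per_metric = {metric: merge_runs(sets) for metric, sets in runs.items()}

# Faults found by at least one metric (combined total).
combined = set().union(*per_metric.values())

# Faults found by all three metrics (overlap).
shared = set.intersection(*per_metric.values())

# Faults found by exactly one metric and missed by the other two.
unique = {
    metric: found - set().union(*(f for m, f in per_metric.items() if m != metric))
    for metric, found in per_metric.items()
}

for metric, found in per_metric.items():
    print(metric, len(found), "faults,", len(unique[metric]), "unique")
print("combined:", len(combined), "shared:", len(shared))
```

If the per-metric unique sets are non-empty, as in this toy data, combining granularities would surface faults that no single metric finds alone, which is the effect the hypothesis is about.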