ICSE2021

Sustainable Solving: Reducing The Memory Footprint of IFDS-Based Data Flow Analyses Using Intelligent Garbage Collection

Steven Arzt

9 citations

Abstract

The IFDS algorithm can be both memory-and compute-intensive for large programs as it needs to store a huge amount of path edges in memory and process them until a xed point. In general, an IFDSbased data-ow analysis, such as taint analysis, aims to discover only the data-ow facts at some program points. Maintaining a huge amount of path edges (with many visited only once) wastes memory resources, and consequently, reduces its scalability and e ciency (due to frequent re-hashings for the path-edge data structure used). This paper introduces a ne-grained garbage collection (GC) algorithm to enable (multi-threaded) IFDS to reduce its memory footprint by removing non-live path edges (i.e., ones that are no longer needed for establishing other path edges) from its path-edge data structure. The resulting IFDS algorithm, named Fpc, retains the correctness, precision, and termination properties of IFDS while avoiding re-processing GC'ed path edges redundantly (in the presence of unknown recursive cycles that may be formed in future iterations of the analysis). Unlike CleanDroid, which augments IFDS with a coarse-grained GC algorithm to collect path edges at the method level, Fpc is ne-grained by collecting path edges at the data-fact level. As a result, Fpc can collect more path edges than CleanDroid, and consequently, cause fewer re-hashings for the path-edge data structure used. In our evaluation, we focus on applying an IFDS-based taint analysis to a set of 28 Android apps. Fpc can scalably analyze three apps that CleanDroid fails to run to completion (under a 3-hour budget per app) due to out-of-memory (OoM). For the remaining 25 apps, Fpc reduces the number of path edges and memory usage incurred under CleanDroid by 4.4× and 1.4× on average, respectively, and consequently, outperforms CleanDroid by 1.7× on average (with 18.5× in the best case).