CCS2025

AI-Augmented Static Analysis: Bridging Heuristics and Completeness for Practical Reverse Engineering

Monika Santra

Abstract

Reverse engineering is difficult for several reasons: code and data are interleaved; names, types, and stack frames are absent; compilers optimize aggressively; and binaries may be protected by a range of obfuscation techniques. Traditional static analysis methods and existing tools try to compensate for this missing information with heuristic-based strategies. Recent advances in artificial intelligence (AI) show promise in tackling these challenges by uncovering complex hidden patterns in incomplete or low-level representations, particularly when predicting high-level semantic constructs that may be lost during compilation. Nevertheless, AI-only solutions frequently fail to deliver the completeness and reliability that security-critical binary analysis requires. To address this shortcoming, we aim to combine AI and static analysis so that each covers the other's weakness: AI replaces brittle heuristics and generalizes better, while static analysis reinforces AI with a best-effort approach to completeness, thereby meeting the rigorous demands of security applications. In this thesis, we concentrate on three essential reverse engineering tasks that are underserved in both academic research and existing tools: instruction boundary identification, function boundary identification, and Control Flow Graph (CFG) construction, specifically for indirect call targets. Our goal is to develop a novel integration of AI and static analysis that yields an end-to-end disassembly framework.