SIGMOD 2025

Testing Graph Databases with Synthesized Queries

Zijing Yin, Si Liu, David A. Basin

2 citations

Abstract

Graph databases (GDBs) are increasingly used in many applications. However, their advanced features make them prone to logic bugs. Despite advances in GDB testing, current approaches share a common limitation: their test oracles lack a ground truth. As a result, they both report spurious bugs and overlook real ones. We introduce GQS (Graph Query Synthesis), the first automated testing approach for detecting logic bugs in GDBs based on an established ground truth. GQS starts by randomly generating a graph and selecting a set of properties associated with its elements, whose key-value pairs form the expected result set that serves as the ground truth. It then synthesizes a query that should retrieve exactly these values from the graph. When the GDB under test executes the query on the graph, any discrepancy between the actual result set and the ground truth indicates a logic bug. To test a GDB extensively, we develop novel techniques that synthesize both syntactically and semantically complex queries. We implement GQS in a tool that incorporates the first Cypher query synthesizer specifically designed for testing GDBs. Our tool finds 36 previously unknown bugs across four production GDBs, of which 26 are logic bugs, some of which remained undetected for up to five years. Additionally, our tool is more effective at bug detection than state-of-the-art testers.
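The workflow described in the abstract (generate a graph, fix the expected result first, synthesize a query for it, then compare) can be illustrated with a minimal Python sketch. This is not the GQS implementation: every function name, the node schema (`id`, `name`, `score`), and the single simple query shape are assumptions made for illustration, and `toy_execute` is a hypothetical stand-in for a real GDB that understands only this one query shape. The actual tool synthesizes far more complex Cypher queries.

```python
import random
import re

def generate_graph(rng, num_nodes=5):
    """Randomly generate a tiny property graph: node id -> properties."""
    return {
        i: {"id": i, "name": f"n{i}", "score": rng.randint(0, 100)}
        for i in range(num_nodes)
    }

def pick_ground_truth(rng, graph, k=2):
    """Select k nodes; their (id, score) pairs form the expected result set."""
    chosen = rng.sample(sorted(graph), k)
    return sorted((nid, graph[nid]["score"]) for nid in chosen)

def synthesize_query(ground_truth):
    """Build a Cypher query that should return exactly the selected values."""
    ids = [nid for nid, _ in ground_truth]
    return (
        f"MATCH (n) WHERE n.id IN {ids} "
        "RETURN n.id AS id, n.score AS score ORDER BY id"
    )

def toy_execute(graph, query):
    """Hypothetical stand-in for the GDB under test (handles only the shape above)."""
    id_list = re.search(r"IN \[(.*?)\]", query).group(1)
    ids = [int(x) for x in id_list.split(",")]
    return [(i, graph[i]["score"]) for i in sorted(ids)]

def matches_ground_truth(ground_truth, actual_rows):
    """Oracle: any discrepancy with the ground truth signals a logic bug."""
    return sorted(actual_rows) == ground_truth

rng = random.Random(0)
graph = generate_graph(rng)
ground_truth = pick_ground_truth(rng, graph)
query = synthesize_query(ground_truth)
rows = toy_execute(graph, query)
print(query)
print("consistent with ground truth:", matches_ground_truth(ground_truth, rows))
```

The point of the design is that the expected result is chosen before the query exists, so the oracle does not depend on the behavior of any database engine. Dropping a row from `rows` before the comparison makes `matches_ground_truth` return `False`, which is how a logic bug would surface.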