SIGMOD2025
Are Database System Researchers Making Correct Assumptions about Transaction Workloads?
Cuong D. T. Nguyen, Kevin Chen, Christopher DeCarolis, Daniel J. Abadi
2 citations
Abstract
Many recent papers have contributed novel concurrency control and transaction processing algorithms that start by making an assumption about the transaction workload submitted by an application, and yield high performance (sometimes by an order of magnitude) under these assumptions. Two of the most common assumptions are (1) Is the read and write set of a transaction known (or easily derivable) directly by analyzing the application code in advance of transaction execution, or is the access set of a transaction dependent on the current state of the database? (2) Does the application send the entire transaction in a single request to the database system or is the transaction sent via several requests ''interactively'', with application code run in between these requests. The database community has made tremendous progress in improving throughput and latency of transaction processing when read/write sets are known in advance, and for non-interactive transactions. However, the impact of this progress is directly dependent on the accuracy of these assumptions both for current and future applications. In this paper, we conduct an extensive study of 111 open-source applications, analyzing over 30,000 transactions to evaluate the accuracy of these assumptions both as they exist in the current codebase, and how extensive are the changes required to the code for these assumptions to hold moving forward. Our study reveals that the second of these assumptions is stronger than the first. More specifically, for 90% of applications, at least 58% of transactions per application have read/write sets that can be inferred in advance. Furthermore, although only 39% of applications contain zero interactive transactions, nonetheless, the majority of the remaining 61% of applications can be converted to being completely non-interactive with minimal changes.These insights underscore the potential for further optimization and research in designing OLTP systems that balance transaction expressivity and performance.