SIGMOD2025

Focus! Fast On-disk Concurrency-control Using Sketches

Deukyeon Hwang, Alexander Conway, Carlos Garcia-Alvarado, Jun Yuan, Naama Ben-David, Rob Johnson, Adriana Szekeres

Abstract

Concurrency-control (CC) mechanisms are essential for ensuring consistency in large-scale key-value stores, but traditional approaches face significant challenges. Mechanisms like 2PL and OCC incur high CPU overheads. Timestamp-based mechanisms are faster but require storing timestamps for every key, resulting in substantial space overhead and numerous I/O operations in disk-based systems. We address these challenges by decomposing timestamp-based CC schemes into two components: a timestamp storage system and a CC protocol. We then show that the timestamp storage system can approximate timestamps for keys not used by ongoing transactions, substantially reducing memory requirements and I/O while maintaining correctness for various protocols (STO, MVTO, and TicToc). We introduce FPSketch, an approximate timestamp storage system, and our evaluation with SplinterDB shows that FPSketch outperforms 2PL and OCC by up to 14× on some workloads and disk-based CC systems by up to 5.9×. Remarkably, FPSketch with just 32KiB of memory yields performance comparable to an idealized in-memory implementations in our evaluation. FPSketch makes timestamp-based concurrency control mechanisms practical for disk-based key-value stores.