SIGMOD 2025
Perseus: Achieving Strong Consistency and High Data Freshness for Scalable Geo-distributed HTAP
Haoze Song, Xusheng Chen, Ruijie Gong, Zekai Sun, Tianxiang Shen, Cheng Li, Hao Feng, Sen Wang, Heming Cui
Abstract
The rise of global data-driven applications has made geo-distributed hybrid transactional and analytical processing (HTAP) databases increasingly desirable. Existing distributed HTAP systems perform well on both transactions and analytical queries, and that performance scales across a large number of data nodes. Unfortunately, when deployed geographically, these systems either provide only weak consistency or suffer from poor data freshness. In this paper, we present Perseus, a scalable HTAP database that enforces strong consistency for both transactions and analytical queries. To handle consistency efficiently, Perseus augments the classical dependency graph used in concurrency control protocols to explicitly record the versions of data and their complete dependencies, which indicates which data must be read together in a snapshot. To minimize data staleness for analytical queries (another important goal of HTAP), Perseus further introduces a new dynamic snapshot algorithm that chooses updates selectively. Extensive evaluation results show that, even compared to HTAP databases with weaker consistency, Perseus achieves up to 90% lower visibility delay, a data freshness metric that captures the time between when a transactional update is committed to the database and when it becomes visible to analytical queries. In addition, Perseus scales across many nodes and is robust to network instability.
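To make the two central notions in the abstract concrete, the following is a minimal sketch, not the Perseus algorithm itself. It assumes a hypothetical `Version` record that carries explicit dependencies, a hypothetical `consistent_snapshot` helper that keeps only versions whose dependencies are all present (so that data that must be read together is read together), and a hypothetical `visibility_delay` helper that follows the metric's definition. All names, and the simplification that a dependency is satisfied only by the exact version it names, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One version of a data item, with the versions it depends on (assumed layout)."""
    key: str
    number: int
    deps: frozenset = frozenset()  # set of (key, number) pairs


def consistent_snapshot(received):
    """Return {key: latest version number} over a dependency-closed subset.

    A version is dropped if any of its dependencies is missing; dropping is
    repeated until nothing changes, so the result is closed under dependencies.
    """
    by_id = {(v.key, v.number): v for v in received}
    included = set(by_id)
    changed = True
    while changed:
        changed = False
        for vid in list(included):
            if any(dep not in included for dep in by_id[vid].deps):
                included.discard(vid)
                changed = True
    snapshot = {}
    for key, number in included:
        if number > snapshot.get(key, -1):
            snapshot[key] = number
    return snapshot


def visibility_delay(commit_time, visible_time):
    """Time between commit and visibility to analytical queries."""
    return visible_time - commit_time


# y@1 depends on x@2, which has not arrived yet, so y@1 is held back.
received = [Version("x", 1), Version("y", 1, frozenset({("x", 2)}))]
before = consistent_snapshot(received)

# Once x@2 arrives, both x@2 and y@1 become visible together.
received.append(Version("x", 2))
after = consistent_snapshot(received)
```

In this toy setting, holding back `y@1` until `x@2` arrives preserves consistency but increases the visibility delay of `y@1`, which is the tension between consistency and freshness that the abstract describes.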