SIGMOD2025
LSM-Raft: Optimizing Raft for LSM-tree Store
Xiaojian Zhang, Xinyu Tan, Shaoxu Song, Xiangdong Huang, Jianmin Wang
Abstract
The Raft consensus algorithm ensures strong consistency by replicating a totally ordered log of operations. However, this strict ordering introduces significant redundancy in databases built on Log-Structured Merge (LSM) trees, where follower-side log transmission and compaction often duplicate storage and computation. In this paper, we present LSM-Raft, a generalized extension of Raft that models the replicated log as a sequence of elements-comprising both atomic entries and compacted SSTables. By aligning log semantics and state machine behavior with LSM-tree principles, LSM-Raft enables more compact and efficient replication while preserving correctness. We generalize Raft's core safety properties to accommodate the relaxed log structure and formally verify them using TLA+. Moreover, we propose a cost-aware synchronization strategy that dynamically replaces delayed raw entries with semantically equivalent SSTables, minimizing transmission and compaction overhead. Experimental results show that LSM-Raft improves overall throughput and significantly reduces resource utilization under real-world workloads, demonstrating its practicality in high-ingest, compaction-intensive environments.