SIGMOD2025

NEXT: A New Secondary Index Framework for LSM-based Data Storage

Jiachen Shi, Jingyi Yang, Gao Cong, Xiaoli Li

被引用 1 次

摘要

Key-value databases with Log Structured Merge tree are increasingly favored by modern applications. Apart from supporting fast lookup on primary key, efficient queries on non-key attributes are also highly demanded by many of these applications. To enhance query performance, many auxiliary structures like secondary indexing and filters have been developed. However, existing auxiliary structures suffer from three limitations. First, creating filter for every disk component has low lookup efficiency as all components need to be searched during query processing. Second, current secondary index design requires primary table access to fetch the data entries for each output primary key from the index. This indirect entries fetching process involves significant point lookup overhead in the primary table and hence hinders the query performance. Last, maintaining the consistency between the secondary index and the primary table is challenging due to the out-of-place update mechanism of the LSM-tree. To overcome the limitations in existing auxiliary structures for non-key attributes queries, this paper proposes a novel secondary index framework, NEXT, for LSM-based key-value storage system. NEXT utilizes a two-level structure which is integrated with the primary table. In particular, NEXT proposes to create secondary index blocks on each LSM disk component to map the secondary attributes to their corresponding data blocks. In addition, NEXT introduces a global index component which is created on top of all secondary index blocks to direct the secondary index operation to the target secondary index blocks. Finally, NEXT adopts two optimization strategies to further improve the query performance. We implement NEXT on RocksDB and experimentally evaluate its performance against existing methods. Experiments on both static and mixed workloads demonstrate that NEXT outperforms existing methods for different types of non-key attributes.