VLDB 2021
KLL±: Approximate Quantile Sketches over Dynamic Datasets
Fuheng Zhao, Sujaya Maiyya, Ryan Weiner, Divy Agrawal, Amr El Abbadi
36 citations
Abstract
Recently, the long-standing problem of optimal construction of quantile sketches was resolved by Karnin, Lang, and Liberty using the KLL sketch (FOCS 2016). The KLL algorithm, however, supports only online insert operations and cannot handle deletes. Many real-world applications need to support delete operations: when a data set is updated dynamically, i.e., when data elements are inserted and deleted, the quantile sketch should reflect the changes. In this paper, we propose KLL±, the first quantile approximation algorithm to operate in the bounded deletion model, accounting for both inserts and deletes in a given data stream. KLL± extends the functionality of KLL sketches to support arbitrary updates with small space overhead. The space bound for KLL± is $O\left(\frac{\alpha^{1.5}}{\epsilon}\log^{2}\log\frac{1}{\epsilon\delta}\right)$, where $\epsilon$ and $\delta$ are constants that determine precision and failure probability, respectively, and $\alpha$ bounds the number of deletions with respect to insert operations. The experimental evaluation of KLL± shows that, with minimal space overhead, KLL± achieves accuracy in quantile approximation comparable to KLL.
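
To make the role of $\alpha$ in the space bound concrete, the following is a minimal illustrative sketch, not the paper's implementation. It evaluates the expression inside the $O(\cdot)$ with the hidden constant ignored, so only ratios between values are meaningful. The function names are hypothetical, and the deletion condition $D \le (1 - 1/\alpha) I$ is an assumption about how the bounded deletion model is usually stated; the abstract itself only says that $\alpha$ bounds deletions relative to inserts.

```python
import math


def kll_pm_space_term(alpha: float, eps: float, delta: float) -> float:
    """Evaluate alpha^1.5 / eps * (log log(1 / (eps * delta)))^2.

    This is the expression inside the O(.) of the stated space bound.
    The hidden constant is ignored, so the result is not a size in bytes.
    """
    if alpha < 1 or not (0 < eps < 1) or not (0 < delta < 1):
        raise ValueError("expected alpha >= 1 and eps, delta in (0, 1)")
    inner = math.log(1.0 / (eps * delta))  # log(1 / (eps * delta)) > 0 here
    return alpha ** 1.5 / eps * math.log(inner) ** 2


def within_bounded_deletion(inserts: int, deletes: int, alpha: float) -> bool:
    """Check the ASSUMED bounded deletion condition D <= (1 - 1/alpha) * I."""
    if alpha < 1:
        raise ValueError("expected alpha >= 1")
    return deletes <= (1.0 - 1.0 / alpha) * inserts


if __name__ == "__main__":
    base = kll_pm_space_term(alpha=1, eps=0.01, delta=0.01)
    for alpha in (1, 2, 4):
        term = kll_pm_space_term(alpha, eps=0.01, delta=0.01)
        print(f"alpha={alpha}: relative space term = {term / base:.2f}")
    print(within_bounded_deletion(inserts=100, deletes=50, alpha=2))
```

Under these assumptions, $\alpha = 1$ corresponds to an insertion-only stream, and the $\alpha^{1.5}$ factor means that quadrupling $\alpha$ multiplies the space term by 8 when $\epsilon$ and $\delta$ are held fixed.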