VLDB2023

Errata for "SpaceSaving±: An Optimal Algorithm for Frequency Estimation and Frequent Items in the Bounded-Deletion Model"

Fuheng Zhao, Divyakant Agrawal, Amr El Abbadi, Ahmed Metwally, Claire Mathieu, Michel de Rougemont

被引用 4 次

摘要

This errata article points out an implicit assumption in the work of four of us published in VLDB 2022. The SpaceSaving± algorithm in bounded deletion data stream presented in the paper implicitly assumed deletions happen after all insertions. When insertions and deletions are interleaved, that algorithm may severely underestimate item's frequency. We first illustrate this phenomenon by an example and then present a modified algorithm with minor changes to allow interleaving between insertions and deletions. We also include a pointer to a full analysis of the new algorithms.