SIGMOD2025
Efficient and Accurate Differentially Private Cardinality Continual Releases
Dongdong Xie, Pinghui Wang, Quanqing Xu, Chuanhui Yang, Rundong Li
1 citation
Abstract
Accurately estimating the number of unique elements that appear in data streams in real time is a fundamental problem with applications including network traffic monitoring and real-time social media analytics. Traditional sketch-based algorithms such as FM Sketch and HyperLogLog offer memory-friendly solutions for cardinality estimation but fall short in scenarios where the stream elements are privacy-sensitive and require differential privacy. Although recent approaches have incorporated differential privacy into the above cardinality estimators, they are limited to single-query settings, restricting their applicability. Previous methods for private cardinality continual release settings-i.e., releasing the cardinality after each new element in the stream-demand large memory resources and are thus difficult to apply in practice. In this paper, we present a novel cardinality estimation framework, FC, which ensures differential privacy under continual releases while simultaneously achieving low memory usage, high accuracy, and efficient computation. Our approach innovatively leverages an efficient cardinality estimator and privacy-preserving mechanisms to overcome the limitations of existing methods. Comprehensive experiments demonstrate that our method reduces memory usage by up to 504 times compared to the best previous method while maintaining nearly the same accuracy. Additionally, under identical memory constraints, our method improves the estimation accuracy by orders of magnitude.