NeurIPS2020
The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space
Adam D. Smith, Shuang Song, Abhradeep Thakurta
44 citations
Abstract
We revisit the problem of counting the number of distinct elements F 0 (D) in a data stream D, over a domain [u]. We propose an (ε, δ)-differentially private algorithm that approximates F 0 (D) within a factor of (1 ± γ), and with additive error of O( ln(1/δ)/ε), using space O(ln(ln(u)/γ)/γ 2 ). We improve on the prior work at least quadratically and up to exponentially, in terms of both space and additive error. Our additive error guarantee is optimal up to a factor of O( ln(1/δ)), and the space bound is optimal up to a factor of O min ln ln(u) . We assume the existence of an ideal uniform random hash function, and ignore the space required to store it. We later relax this requirement by assuming pseudorandom functions and appealing to a computational variant of differential privacy, SIM-CDP. Our algorithm is built on top of the celebrated Flajolet-Martin (FM) sketch. We show that FM-sketch is differentially private as is, as long as there are ≈ ln(1/δ)/(εγ) distinct elements in the data set. Along the way, we prove a structural result showing that the maximum of k i.i.d. random variables is statistically close (in the sense of ε-differential privacy) to the maximum of (k + 1) i.i.d. samples from the same distribution, as long as k = Ω 1 ε . Finally, experiments show that our algorithms introduces error within an order of magnitude of the non-private analogues for streams with thousands of distinct elements, even while providing strong privacy guarantee (ε ≤ 1).