STOC 2025
Fingerprinting Codes Meet Geometry: Improved Lower Bounds for Private Query Release and Adaptive Data Analysis
Xin Lyu, Kunal Talwar
2 citations
Abstract
Fingerprinting codes are a crucial tool for proving lower bounds in differential privacy. They have been used to prove tight lower bounds for several fundamental questions, especially in the “low accuracy” regime. Unlike reconstruction/discrepancy approaches, however, they are more suited for query sets that arise naturally from the fingerprinting code construction. In this work, we propose a general framework for proving fingerprinting-type lower bounds that allows us to tailor the technique to the geometry of the query set. Our approach allows us to prove several new results, including the following.

We show that any (sample- and population-)accurate algorithm for answering $Q$ arbitrary adaptive counting queries over a universe $X$ to accuracy $\alpha$ needs $\Omega(\sqrt{\log|X|} \cdot \log Q / \alpha^3)$ samples, matching known upper bounds. This shows that the approaches based on differential privacy are optimal for this question, and improves significantly on the previously known lower bounds of $\log Q / \alpha^2$ and $\min(\sqrt{Q}, \sqrt{\log|X|}) / \alpha^2$.

We show that any $(\varepsilon, \delta)$-DP algorithm for answering $Q$ counting queries to accuracy $\alpha$ needs $\Omega(\sqrt{\log|X| \log(1/\delta)} \cdot \log Q / (\varepsilon \alpha^2))$ samples, matching known upper bounds up to constants. Our framework allows for proving this bound via a direct correlation analysis and improves the previously known bound by a factor of $\sqrt{\log(1/\delta)}$.

For privately releasing a set of random 0-1 queries, we show tight sample complexity lower bounds in the high accuracy regime. In the low accuracy regime, the picture is more complex: we show that there is a discontinuity in the sample complexity. For $Q$ random queries over a universe $X$, the sample complexity grows as $\Theta_{\varepsilon,\delta}(1/\alpha^2)$, with no dependence on $Q$ or $|X|$. This new sample complexity bound, based on sparse histograms, is asymptotically better than known lower bounds for CDP. However, at $\alpha \approx \sqrt{\log|X|} / \sqrt{Q}$, the sample complexity jumps to $\Theta_{\varepsilon,\delta}(\sqrt{Q}/\alpha)$.
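To get a rough sense of the size of the discontinuity for random queries, one can plug the threshold accuracy into both stated rates. The following is only a back-of-the-envelope sketch: it ignores all constants and the dependence on the privacy parameters hidden in the $\Theta$ notation, and the symbol $\alpha^\star$ is introduced here for illustration and does not appear in the abstract.

```latex
% Threshold accuracy (alpha^\star is notation assumed for this sketch)
\[
  \alpha^\star = \frac{\sqrt{\log|X|}}{\sqrt{Q}}
\]
% Low-accuracy rate evaluated at the threshold
\[
  \frac{1}{(\alpha^\star)^2} = \frac{Q}{\log|X|}
\]
% Rate after the jump, evaluated at the same threshold
\[
  \frac{\sqrt{Q}}{\alpha^\star} = \frac{\sqrt{Q} \cdot \sqrt{Q}}{\sqrt{\log|X|}} = \frac{Q}{\sqrt{\log|X|}}
\]
% Ratio of the two rates at the threshold
\[
  \frac{Q / \sqrt{\log|X|}}{Q / \log|X|} = \sqrt{\log|X|}
\]
```

Under these simplifications, the two rates differ at the threshold by a factor of about $\sqrt{\log|X|}$, which is one way to read the "jump" described above.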