STOC2025
Harmonic Decomposition in Data Sketches
Dingyu Wang
1 citation
Abstract
In the turnstile streaming model, a dynamic vector x = (x 1 , . . . , x n ) ∈ Z n is updated by a stream of entry-wise increments/decrements. Let f : Z → R + be a symmetric function with We revisit the problem of constructing a universal sketch that can estimate many different f -moments. Previous constructions of universal sketches rely on the technique of sampling with respect to the L 0 -mass (uniform samples) or L 2 -mass (L 2 -heavy-hitters), whose universality comes from being able to evaluate the function f over the samples. To get samples, hash collisions are deliberately detected and avoided (with high probability), e.g. singleton-detectors are used in L 0 -sampling and the CountSketch is used in L 2 -sampling. Such auxiliary data structures introduce significant overhead in space. Apart from this issue, sampling-based methods are shown to perform poorly for estimating certain "nearly periodic functions" where Ω(poly(n)) samples are needed. In this paper, we propose a new universal sketching scheme that is almost "dual" to the sampling-based methods. Instead of evaluating f on samples, we decompose f into a linear combination of homomorphisms f 1 , f 2 , . . . from (Z, +) to (C, ×), where the f k -moments can be estimated regardless of hash collisions-because each f k is a homomorphism! Then we synthesize the estimates of the f k -moments to obtain an estimate of the f -moment. Universality now comes from the fact that we can weight the f k -moments arbitrarily, where the correct weighting depends on the harmonic structure of the function f . In contrast to the sampling-based methods, the new SymmetricPoissonTower sketch takes the harmonic approach. It embraces hash collisions instead of avoiding them, which saves multiple log n factors in space, e.g., when estimating all L p -moments (f (z) = |z| p , p ∈ [0, 2]). For many nearly periodic functions, the SymmetricPoissonTower is exponentially more efficient than sampling-based methods. We conjecture that the SymmetricPoissonTower is the universal sketch that can estimate every tractable function f .