STOC2025
A Framework for Building Data Structures from Communication Protocols
Alexandr Andoni, Shunhua Jiang, Omri Weinstein
Abstract
We present a general framework for designing efficient data structures for high-dimensional pattern-matching problems () through communication models in which admits sublinear communication protocols with exponentially-small error. Specifically, we reduce the data structure problem to the Unambiguous Arthur-Merlin (UAM) communication complexity of under product distributions. We apply our framework to the Partial Match problem (a.k.a, matching with wildcards), whose underlying communication problem is sparse set-disjointness. When the database consists of points in dimension , and the number of 's in the query is at most , the fastest known linear-space data structure (Cole, Gottlieb and Lewenstein, STOC'04) had query time , which is nontrivial only when . By contrast, our framework produces a data structure with query time and space close to linear. To achieve this, we develop a one-sided -error communication protocol for Set-Disjointness under product distributions with complexity, improving on the classical result of Babai, Frankl and Simon (FOCS'86). Building on this protocol, we show that the Unambiguous AM communication complexity of -Sparse Set-Disjointness with -error under product distributions is , independent of the ambient dimension , which is crucial for the Partial Match result. Our framework sheds further light on the power of data-dependent data structures, which is instrumental for reducing to the (much easier) case of product distributions.