ICLR2025
Matrix Product Sketching via Coordinated Sampling
Majid Daliri, Juliana Freire, Danrong Li, Christopher Musco
Abstract
We revisit the well-studied problem of approximating a matrix product, , based on small space sketches and of and . We are interested in the setting where the sketches must be computed independently of each other, except for the use of a shared random seed. We prove that, when and are sparse, methods based on coordinated random sampling can outperform classical linear sketching approaches, like Johnson-Lindenstrauss Projection or CountSketch. For example, to obtain Frobenius norm error , coordinated sampling requires sketches of size when and have at most non-zeros per row. In contrast, linear sketching leads to sketches of size and for and . We empirically evaluate our approach on two applications: 1) distributed linear regression in databases, a problem motivated by tasks like dataset discovery and augmentation, and 2) approximating attention matrices in transformer-based language models. In both cases, our sampling algorithms yield an order of magnitude improvement over linear sketching.