NeurIPS2024

Euclidean distance compression via deep random features

Brett Leroux, Luis Rademacher

Abstract

Motivated by the problem of compressing point sets into as few bits as possible while maintaining information about approximate distances between points, we construct random nonlinear maps $\varphi_\ell$ that compress point sets in the following way. For a point set $S$ , the map $\varphi_\ell:\mathbb{R}^d \to N^{-1/2}\{-1,1\}^N$ has the property that storing $\varphi_\ell(S)$ (a sketch of $S$ ) allows one to report pairwise squared distances between points in $S$ up to some multiplicative $(1\pm \epsilon)$ error with high probability as long as the minimum distance is not too small compared to $\epsilon$ . The maps $\varphi_\ell$ are the $\ell$ -fold composition of a certain type of random feature mapping. Moreover, we determine how large $N$ needs to be as a function of $\epsilon$ and other parameters of the point set. Compared to existing techniques, our maps offer several advantages. The standard method for compressing point sets by random mappings relies on the Johnson-Lindenstrauss lemma which implies that if a set of $n$ points is mapped by a Gaussian random matrix to $\mathbb{R}^k$ with $k =\Theta(\epsilon^{-2}\log n)$ , then pairwise distances between points are preserved up to a multiplicative $(1\pm \epsilon)$ error with high probability. The main advantage of our maps $\varphi_\ell$ over random linear maps is that ours map point sets directly into the discrete cube $N^{-1/2}\{-1,1\}^N$ and so there is no additional step needed to convert the sketch to bits. For some range of parameters, our maps $\varphi_\ell$ produce sketches which require fewer bits of storage space.