ICML2023

Data Structures for Density Estimation

Anders Aamand, Alexandr Andoni, Justin Y. Chen, Piotr Indyk, Shyam Narayanan, Sandeep Silwal

被引用 6 次

摘要

We study statistical/computational tradeoffs for the following density estimation problem: given kk distributions v1,,vkv_1, \ldots, v_k over a discrete domain of size nn, and sampling access to a distribution pp, identify viv_i that is"close"to pp. Our main result is the first data structure that, given a sublinear (in nn) number of samples from pp, identifies viv_i in time sublinear in kk. We also give an improved version of the algorithm of Acharya et al. (2018) that reports viv_i in time linear in kk. The experimental evaluation of the latter algorithm shows that it achieves a significant reduction in the number of operations needed to achieve a given accuracy compared to prior work.