NeurIPS2020

On Testing of Samplers

Kuldeep S. Meel, Yash Pote, Sourav Chakraborty

19 citations

Abstract

Given a set of items F\mathcal{F} and a weight function wt:F(0,1)\mathtt{wt}: \mathcal{F} \mapsto (0,1), the problem of sampling seeks to sample an item proportional to its weight. Sampling is a fundamental problem in machine learning. The daunting computational complexity of sampling with formal guarantees leads designers to propose heuristics-based techniques for which no rigorous theoretical analysis exists to quantify the quality of generated distributions. This poses a challenge in designing a testing methodology to test whether a sampler under test generates samples according to a given distribution. Only recently, Chakraborty and Meel (2019) designed the first scalable verifier, called Barbarik1, for samplers in the special case when the weight function wt\mathtt{wt} is constant, that is, when the sampler is supposed to sample uniformly from F\mathcal{F} . The techniques in Barbarik1, however, fail to handle general weight functions. The primary contribution of this paper is an affirmative answer to the above challenge: motivated by Barbarik1 but using different techniques and analysis, we design Barbarik2 an algorithm to test whether the distribution generated by a sampler is ε\varepsilon-close or η\eta-far from any target distribution. In contrast to black-box sampling techniques that require a number of samples proportional to F|\mathcal{F}| , Barbarik2 requires only O~(tilt(wt,φ)2/η(η6ε)3)\tilde{O}(tilt(\mathtt{wt},\varphi)^2/\eta(\eta - 6\varepsilon)^3) samples, where the tilttilt is the maximum ratio of weights of two satisfying assignments. Barbarik2 can handle any arbitrary weight function. We present a prototype implementation of Barbarik2 and use it to test three state-of-the-art samplers.