NeurIPS2021

Testing Probabilistic Circuits

Yash Pote, Kuldeep S. Meel

10 citations

Abstract

Probabilistic circuits (PCs) are a powerful modeling framework for representing tractable probability distributions over combinatorial spaces. In machine learning and probabilistic programming, one is often interested in understanding whether the distributions learned using PCs are close to the desired distribution. Thus, given two probabilistic circuits, a fundamental problem of interest is to determine whether their distributions are close to each other. The primary contribution of this paper is a closeness test for PCs with respect to the total variation distance metric. Our algorithm utilizes two common PC queries, counting and sampling. In particular, we provide a poly-time probabilistic algorithm to check the closeness of two PCs, when the PCs support tractable approximate counting and sampling. We demonstrate the practical efficiency of our algorithmic framework via a detailed experimental evaluation of a prototype implementation against a set of 475 PC benchmarks. We find that our test correctly decides the closeness of all 475 PCs within 3600 seconds. * The accompanying tool, available open source, can be found at https://github.com/meelgroup/teq . The Appendix is available in the accompanying supplementary material. † The authors decided to forgo the old convention of alphabetical ordering of authors in favor of a randomized ordering, denoted by r . The publicly verifiable record of the randomization is available at https://www.aeaweb.org/journals/policies/random-author-order/search with confirmation code: icoxrj0sNB2L. For citation of the work, authors request that the citation guidelines by AEA for random author ordering be followed. 35th Conference on Neural Information Processing Systems (NeurIPS 2021). encode non-identical distributions that are nonetheless close enough for practical purposes. Such situations may arise due to the use of approximate PC compilation [10] and sampling-based learning of PCs [26, 27] . As a concrete example, consider PCs that are learned via approximate methods such as stochastic gradient descent [27] . In such a case, PCs are likely to converge to close but nonidentical distributions. Given two such PCs, we would like to know whether they have converged to distributions close to each other. Thus, we raise the question: Does there exist an efficient algorithm to test the closeness of two PC distributions? In this work, we design the first closeness test for PCs with respect to TV distance, called Teq. Assuming the tested PCs allow poly-time approximate weighted model counting and sampling, Teq runs in polynomial time. Formally, given two PC distributions P and Q, and three parameters (ε,η,δ), for closeness(ε), farness(η), and tolerance(δ), Teq returns Accept if d T V (P, Q) ≤ ε and Reject if d T V (P, Q) ≥ η with probability at least 1 -δ. Teq makes atmost O((η -ε) -2 log(δ -1 )) calls to the sampler and exactly 2 calls to the counter.