NeurIPS 2021

Divergence Frontiers for Generative Models: Sample Complexity, Quantization Effects, and Frontier Integrals

Lang Liu, Krishna Pillutla, Sean Welleck, Sewoong Oh, Yejin Choi, Zaïd Harchaoui

Cited 21 times

Abstract

The spectacular success of deep generative models calls for quantitative tools to measure their statistical performance. Divergence frontiers have recently been proposed as an evaluation framework for generative models, due to their ability to measure the quality-diversity trade-off inherent to deep generative modeling. We establish non-asymptotic bounds on the sample complexity of divergence frontiers. We also introduce frontier integrals, which provide summary statistics of divergence frontiers. We show how smoothed estimators such as Good-Turing or Krichevsky-Trofimov can overcome the missing mass problem and lead to faster rates of convergence. We illustrate the theoretical results with numerical examples from natural language processing and computer vision.

While this framework is mathematically elegant and empirically successful [37, 49], the statistical properties of divergence frontiers are not well understood. Estimating divergence frontiers from data for large generative models involves two approximations: (a) joint quantization of the model distribution and the target distribution into discrete distributions with quantization level k, and (b) statistical estimation of the divergence frontiers based on the quantized distributions. Djolonga et al. [18] argue that the quantization often introduces a positive bias, making the distributions appear closer than they really are, while a small sample size can result in a pessimistic estimate of the divergence frontier. The latter effect is due to the missing mass of the samples: because the samples do not cover some parts of the distributions, the two distributions appear farther apart than they really are. The first consideration favors a large k, while the second favors a small k.

35th Conference on Neural Information Processing Systems (NeurIPS 2021).
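To make the missing mass problem concrete, the following minimal sketch contrasts a plug-in estimate of a quantized distribution with Krichevsky-Trofimov (add-1/2) smoothing, and computes the classical Good-Turing estimate of the unseen mass. It is an illustration under stated assumptions, not the estimators exactly as analyzed in the paper: the function names, the toy samples, and the choice k = 5 are all hypothetical, and the quantizer that would map raw data to bin indices is not shown.

```python
from collections import Counter


def empirical(samples, k):
    """Plug-in estimate over bins 0..k-1: count / n (unseen bins get mass 0)."""
    counts = Counter(samples)
    n = len(samples)
    return [counts[i] / n for i in range(k)]


def krichevsky_trofimov(samples, k):
    """Add-1/2 smoothing: (count + 1/2) / (n + k/2) for each of the k bins."""
    counts = Counter(samples)
    n = len(samples)
    return [(counts[i] + 0.5) / (n + 0.5 * k) for i in range(k)]


def good_turing_missing_mass(samples):
    """Good-Turing estimate of the unseen mass: (# bins seen exactly once) / n."""
    counts = Counter(samples)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(samples)


# Toy data (assumed): bin indices from a hypothetical quantizer with k = 5 bins.
# Bin 4 is never observed, so the plug-in estimate assigns it zero mass.
samples = [0, 0, 1, 2, 2, 2, 3]
k = 5

p_hat = empirical(samples, k)
p_kt = krichevsky_trofimov(samples, k)
missing = good_turing_missing_mass(samples)

print("plug-in estimate:      ", p_hat)
print("KT-smoothed estimate:  ", p_kt)
print("Good-Turing unseen mass:", missing)
```

In this toy example the plug-in estimate gives the unobserved bin zero probability, whereas the smoothed estimate keeps every bin strictly positive. Increasing k with the sample size held fixed leaves more bins unobserved, which is the tension between the two approximations described above.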