ACL2020

On Importance Sampling-Based Evaluation of Latent Language Models

Robert L. Logan IV, Matt Gardner, Sameer Singh

被引用 3 次

摘要

Language models that use additional latent structures (e.g., syntax trees, coreference chains, and knowledge graph links) provide several advantages over traditional language models. However, likelihood-based evaluation of these models is often intractable as it requires marginalizing over the latent space. Existing methods avoid this issue by using importance sampling. Although this approach has asymptotic guarantees, analysis is rarely conducted on the effect of decisions such as sample size, granularity of sample aggregation, and the proposal distribution on the reported estimates. In this paper, we measure the effect these factors have on perplexity estimates for three different latent language models. In addition, we elucidate subtle differences in how importance sampling is applied, which can have substantial effects on the final estimates, as well as provide theoretical results that reinforce the validity of importance sampling for evaluating latent language models.