NeurIPS 2024
Near-Optimality of Contrastive Divergence Algorithms
Pierre Glaser, Kevin Han Huang, Arthur Gretton
Abstract
We perform a non-asymptotic analysis of the contrastive divergence (CD) algorithm, a training method for unnormalized models. While prior work has established that (for exponential family distributions) the CD iterates asymptotically converge at an $O(n^{-1/3})$ rate to the true parameter of the data distribution, we show, under some regularity assumptions, that CD can achieve the parametric rate $O(n^{-1/2})$. Our analysis provides results for various data batching schemes, including the fully online and minibatch ones. We additionally show that CD can be near-optimal, in the sense that its asymptotic variance is close to the Cramér-Rao lower bound.
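
To make the fully online scheme concrete, here is a minimal sketch of online CD on a toy one-parameter exponential family. It is an illustration under stated assumptions, not the authors' experimental setup. The model p_theta(x) ∝ exp(theta*x - x^2/2), which is N(theta, 1), the autoregressive Gaussian Markov kernel, the step size c/t, and the names `cd_online` and `run_demo` are all hypothetical choices made for this example. Each iteration takes one fresh data point, runs k kernel steps from it to get a negative sample, and moves theta along the difference of sufficient statistics.

```python
import math
import random


def cd_online(data, theta0=0.0, c=2.0, rho=0.5, k=1, rng=None):
    """Online CD-k for the toy model p_theta(x) ∝ exp(theta*x - x^2/2) = N(theta, 1).

    The sufficient statistic is phi(x) = x. All defaults are illustrative
    assumptions, not values taken from the paper.
    """
    rng = rng or random.Random(0)
    theta = theta0
    for t, x in enumerate(data, start=1):
        # Negative sample: k steps of a Markov kernel that leaves N(theta, 1)
        # invariant, started at the observed data point (the CD initialisation).
        x_neg = x
        for _ in range(k):
            noise = rng.gauss(0.0, 1.0)
            x_neg = theta + rho * (x_neg - theta) + math.sqrt(1.0 - rho**2) * noise
        # CD update: positive-phase statistic minus negative-phase statistic,
        # with a decaying step size c / t (assumed schedule).
        theta += (c / t) * (x - x_neg)
    return theta


def run_demo(n=20000, theta_star=1.5, seed=0):
    """Draw n samples from N(theta_star, 1) and return the final CD iterate."""
    rng = random.Random(seed)
    data = [rng.gauss(theta_star, 1.0) for _ in range(n)]
    return cd_online(data, rng=rng)


if __name__ == "__main__":
    theta_hat = run_demo()
    print("absolute error:", abs(theta_hat - 1.5))
```

In this toy case the expected update is proportional to (theta_star - theta), so the iterates drift toward the true parameter. A minibatch variant would average the statistic difference over several data points per step.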