Adiabatic Persistent Contrastive Divergence Learning
Hyeryung Jang, Hyungwon Choi, Yung Yi, Jinwoo Shin

TL;DR
This paper introduces an efficient, convergent learning algorithm for probabilistic graphical models with latent variables, using a novel multi-time-scale stochastic approximation approach with incomplete Markov Chain sampling.
Contribution
It proposes a new algorithm that ensures convergence while running only a few Markov chain cycles in both the E and M steps. It extends contrastive divergence methods, which rely on approximated gradients of the log-likelihood, with exact gradient computation.
Findings
The algorithm provably converges to a correct optimum.
Hybrid approach improves performance over mean-field CD in experiments.
The method is effective on real-world datasets despite potential slow-mixing issues.
Abstract
This paper studies the problem of parameter learning in probabilistic graphical models with latent variables, where the standard approach is the expectation-maximization algorithm, which alternates expectation (E) and maximization (M) steps. However, both the E and M steps are computationally intractable for high-dimensional data, while substituting a faster surrogate for one step to combat intractability can often cause convergence to fail. We propose a new learning algorithm which is computationally efficient and provably ensures convergence to a correct optimum. Its key idea is to run only a few cycles of Markov Chains (MC) in both the E and M steps. The idea of running incomplete MC has been well studied in the literature only for the M step, where it is called Contrastive Divergence (CD) learning. While such known CD-based schemes find approximated gradients of the log-likelihood via…
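The idea of running only a few Markov chain cycles in both steps can be illustrated with a minimal sketch. This is not the authors' algorithm and carries none of its convergence guarantees. It assumes a small binary Boltzmann machine with hidden-to-hidden couplings `J` (so that the E step itself needs sampling), persistent chains for both phases, and a decaying step size `0.1 / t**0.6`. All names, the model, and the step-size schedule are hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_hidden(v, h, W, J, c, rng):
    """One Gibbs sweep over the hidden units with the visible units v held fixed."""
    for j in range(h.shape[1]):
        # J has a zero diagonal, so unit j does not see its own state.
        act = v @ W[:, j] + h @ J[:, j] + c[j]
        h[:, j] = rng.random(h.shape[0]) < sigmoid(act)
    return h

def gibbs_visible(h, W, b, rng):
    """Visible units are conditionally independent given h."""
    p = sigmoid(h @ W.T + b)
    return (rng.random(p.shape) < p).astype(float)

def train(data, n_hidden=4, n_iters=200, k=1, seed=0):
    rng = np.random.default_rng(seed)
    n, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    J = np.zeros((n_hidden, n_hidden))  # symmetric, zero diagonal
    b = np.zeros(n_visible)
    c = np.zeros(n_hidden)

    # Persistent chains: they are never restarted between parameter updates.
    h_pos = rng.integers(0, 2, (n, n_hidden)).astype(float)  # E-step chain (v clamped to data)
    h_neg = rng.integers(0, 2, (n, n_hidden)).astype(float)  # M-step chain (free-running)
    v_neg = data.copy()

    for t in range(1, n_iters + 1):
        lr = 0.1 / t**0.6  # decaying step size (assumed schedule)
        for _ in range(k):  # only k Gibbs cycles per update, in both phases
            h_pos = gibbs_hidden(data, h_pos, W, J, c, rng)
            h_neg = gibbs_hidden(v_neg, h_neg, W, J, c, rng)
            v_neg = gibbs_visible(h_neg, W, b, rng)

        # Stochastic gradient: clamped statistics minus free-running statistics.
        W += lr * (data.T @ h_pos - v_neg.T @ h_neg) / n
        dJ = (h_pos.T @ h_pos - h_neg.T @ h_neg) / n
        np.fill_diagonal(dJ, 0.0)
        J += lr * dJ
        b += lr * (data - v_neg).mean(axis=0)
        c += lr * (h_pos - h_neg).mean(axis=0)
    return W, J, b, c
```

Calling `train(data)` on a binary array of shape `(n_samples, n_visible)` returns the learned parameters. The point of the sketch is that neither chain is run to equilibrium between updates: each parameter step uses only `k` Gibbs cycles from wherever the chains currently are.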
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Bayesian Methods and Mixture Models
