Learning Energy-based Model via Dual-MCMC Teaching
Jiali Cui, Tian Han

TL;DR
This paper introduces a joint learning framework for energy-based models and generator models, using dual-MCMC teaching to improve sampling efficiency and accuracy in EBM training.
Contribution
It proposes a novel interwoven training method that combines MLE for both models and dual-MCMC teaching to enhance EBM learning.
Findings
Improved EBM training efficiency and accuracy.
Effective integration of generator and EBM via dual-MCMC teaching.
Enhanced sampling quality for energy-based models.
Abstract
This paper studies the fundamental learning problem of the energy-based model (EBM). The EBM can be learned by maximum likelihood estimation (MLE), which typically involves Markov Chain Monte Carlo (MCMC) sampling such as Langevin dynamics. However, noise-initialized Langevin dynamics can be challenging in practice and hard to mix. This motivates joint training with a generator model, where the generator serves as a complementary model to bypass MCMC sampling. However, such a method can be less accurate than MCMC and result in biased EBM learning. While the generator can also serve as an initializer model for better MCMC sampling, its learning can be biased since it only matches the EBM and has no access to empirical training examples. Such biased generator learning may limit the potential of learning the EBM. To address this…
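The noise-initialized Langevin dynamics mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the one-dimensional quadratic energy, the target mean `mu`, the step size, and the function names `grad_energy` and `langevin_sample` are all assumptions chosen for illustration. Each step moves the sample down the energy gradient and adds Gaussian noise.

```python
import random

def grad_energy(x, mu=2.0):
    # Assumed toy energy E(x) = 0.5 * (x - mu)^2, so exp(-E(x)) is an
    # unnormalized N(mu, 1) density. A real EBM would use a neural network.
    return x - mu

def langevin_sample(x0, n_steps=200, step_size=0.3, rng=None):
    # Unadjusted Langevin update:
    #   x <- x - (s^2 / 2) * dE/dx + s * eps,  eps ~ N(0, 1)
    if rng is None:
        rng = random.Random(0)
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - 0.5 * step_size ** 2 * grad_energy(x) + step_size * noise
    return x

# Noise initialization: every chain starts from a standard normal draw.
rng = random.Random(0)
samples = [langevin_sample(rng.gauss(0.0, 1.0), rng=rng) for _ in range(500)]
sample_mean = sum(samples) / len(samples)
print(f"mean of Langevin samples: {sample_mean:.3f}")
```

With this well-behaved toy energy the chains settle near `mu`. Passing a much smaller `step_size` or `n_steps` leaves them close to their noise initialization, a simple analogue of the mixing difficulty the abstract describes and the reason a better initializer for `x0` can help.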
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Time Series Analysis and Forecasting · Machine Learning in Materials Science
Methods: energy-based model
