How to Train Your Energy-Based Models
Yang Song, Diederik P. Kingma

TL;DR
This paper provides a comprehensive introduction to modern training methods for Energy-Based Models, covering MCMC, MCMC-free approaches like Score Matching and Noise Contrastive Estimation, and their theoretical connections.
Contribution
It offers a clear, tutorial-style overview of EBM training techniques, including recent advances and theoretical insights, aimed at researchers and practitioners.
Findings
Explains maximum likelihood training with MCMC.
Details MCMC-free methods such as Score Matching and NCE.
Highlights theoretical relationships among different training approaches.
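As an illustration of the MCMC ingredient named in the first finding, here is a minimal sketch of unadjusted Langevin sampling from an energy function. It is not code from the paper: the quadratic energy `energy(x) = x**2 / 2`, the function names, the step size, and the chain counts are all assumptions chosen so the result can be checked against a known answer (the corresponding normalized density is a standard normal).

```python
import numpy as np


def energy(x):
    # Hypothetical energy E(x) = x^2 / 2; exp(-E(x)) is an unnormalized
    # standard normal density. Shown for reference, the sampler only needs
    # its gradient.
    return 0.5 * x ** 2


def grad_energy(x):
    # Gradient of the hypothetical energy above with respect to x.
    return x


def langevin_sample(n_chains=5000, n_steps=500, step_size=0.05, seed=0):
    # Run independent unadjusted Langevin chains in parallel.
    # The normalizing constant is never needed: only grad E(x) is used.
    rng = np.random.default_rng(seed)
    x = rng.uniform(-3.0, 3.0, size=n_chains)  # arbitrary initialization
    for _ in range(n_steps):
        noise = rng.standard_normal(n_chains)
        # x <- x - (eps / 2) * grad E(x) + sqrt(eps) * noise
        x = x - 0.5 * step_size * grad_energy(x) + np.sqrt(step_size) * noise
    return x


samples = langevin_sample()
print("sample mean:", samples.mean())
print("sample std:", samples.std())
```

With a small step size the sample mean and standard deviation should land close to 0 and 1. In maximum likelihood training, samples produced this way would stand in for draws from the model when estimating the gradient of the log-likelihood.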
Abstract
Energy-Based Models (EBMs), also known as non-normalized probabilistic models, specify probability density or mass functions up to an unknown normalizing constant. Unlike most other probabilistic models, EBMs do not place a restriction on the tractability of the normalizing constant, thus are more flexible to parameterize and can model a more expressive family of probability distributions. However, the unknown normalizing constant of EBMs makes training particularly difficult. Our goal is to provide a friendly introduction to modern approaches for EBM training. We start by explaining maximum likelihood training with Markov chain Monte Carlo (MCMC), and proceed to elaborate on MCMC-free approaches, including Score Matching (SM) and Noise Contrastive Estimation (NCE). We highlight theoretical connections among these three approaches, and end with a brief survey on alternative training…
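The abstract's description of a density known only up to a normalizing constant can be written out as follows. The symbols E_θ (energy function) and Z_θ (normalizing constant) are standard notation assumed here, since the abstract itself introduces none.

```latex
% An EBM defines a density through an energy function E_\theta:
p_\theta(x) = \frac{\exp\bigl(-E_\theta(x)\bigr)}{Z_\theta},
\qquad
Z_\theta = \int \exp\bigl(-E_\theta(x)\bigr)\,dx .

% Taking the gradient of the log-likelihood removes Z_\theta itself
% but leaves an expectation under the model:
\nabla_\theta \log p_\theta(x)
  = -\nabla_\theta E_\theta(x)
  + \mathbb{E}_{x' \sim p_\theta}\bigl[\nabla_\theta E_\theta(x')\bigr].
```

The second term is the one that is hard to compute, because it requires samples from the model. Estimating it with MCMC samples gives the maximum likelihood approach the abstract mentions first.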
Taxonomy
Topics: Energy Load and Power Forecasting · Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis
Methods: energy-based model
