Efficient Training of Energy-Based Models Using Jarzynski Equality
Davide Carbone, Mengjian Hua, Simon Coste, Eric Vanden-Eijnden

TL;DR
This paper introduces a method for training energy-based models efficiently by combining the Jarzynski equality with sequential Monte Carlo techniques. In the reported experiments it outperforms the standard contrastive divergence algorithm.
Contribution
It combines the Jarzynski equality with a modified unadjusted Langevin algorithm (ULA), in which each walker carries a weight, to estimate the cross-entropy gradient without the uncontrolled approximations made by contrastive divergence.
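For context, the gradient in question has a standard form. The notation below is an assumption, since this page does not fix one: the model density is written as \rho_\theta(x) = e^{-U_\theta(x)} / Z_\theta and the data density as \rho_*.

```latex
% Cross-entropy of the model relative to the data, and its gradient.
% U_\theta, Z_\theta and \rho_* are assumed notation, not taken from this page.
\begin{aligned}
H(\rho_*, \rho_\theta) &= \mathbb{E}_{\rho_*}\!\left[U_\theta\right] + \log Z_\theta, \\
\nabla_\theta H(\rho_*, \rho_\theta) &= \mathbb{E}_{\rho_*}\!\left[\nabla_\theta U_\theta\right] - \mathbb{E}_{\rho_\theta}\!\left[\nabla_\theta U_\theta\right].
\end{aligned}
```

The first expectation is an average over the data. The second is an average over the model itself, which is the term that requires sampling the model distribution.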
Findings
Outperforms contrastive divergence in experiments
Effective on Gaussian mixtures and MNIST datasets
Reduces sampling bias during training
Abstract
Energy-based models (EBMs) are generative models inspired by statistical physics with a wide range of applications in unsupervised learning. Their performance is best measured by the cross-entropy (CE) of the model distribution relative to the data distribution. Using the CE as the objective for training is, however, challenging because computing its gradient with respect to the model parameters requires sampling the model distribution. Here we show how results from nonequilibrium thermodynamics based on the Jarzynski equality, together with tools from sequential Monte Carlo sampling, can be used to perform this computation efficiently and avoid the uncontrolled approximations made using the standard contrastive divergence algorithm. Specifically, we introduce a modification of the unadjusted Langevin algorithm (ULA) in which each walker acquires a weight that enables the estimation of…
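The abstract is cut off before it states the weight update, so the following is only a minimal sketch of the general idea and should not be read as the authors' algorithm. It uses the standard sequential importance sampling construction with ULA as both forward and backward kernel. The toy energy U_theta(x) = (x - theta)^2 / 2, the fixed parameter schedule (no training takes place), and all function names are assumptions made for illustration.

```python
import math
import random


def energy(x, theta):
    """Toy energy (assumed): unit-variance Gaussian centred at theta."""
    return 0.5 * (x - theta) ** 2


def grad_energy(x, theta):
    """Gradient of the toy energy with respect to x."""
    return x - theta


def alpha(x, y, theta, h):
    """Minus the log of exp(-U_theta(x)) times the ULA kernel x -> y,
    dropping the term |y - x|^2 / (4h), which cancels in the weight ratio."""
    g = grad_energy(x, theta)
    return energy(x, theta) + 0.5 * (y - x) * g + 0.25 * h * g * g


def weighted_ula(thetas, n_walkers=2000, h=0.1, seed=0):
    """Run ULA walkers while theta follows the schedule `thetas`,
    accumulating one log-weight per walker."""
    rng = random.Random(seed)
    # Walkers start from exact samples of the initial model, log-weight 0.
    xs = [thetas[0] + rng.gauss(0.0, 1.0) for _ in range(n_walkers)]
    log_w = [0.0] * n_walkers
    for k in range(len(thetas) - 1):
        th_old, th_new = thetas[k], thetas[k + 1]
        for i in range(n_walkers):
            x = xs[i]
            # One ULA step under the current parameter value.
            y = x - h * grad_energy(x, th_old) + math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)
            # Importance-weight increment: backward kernel under th_new
            # over forward kernel under th_old, in log form.
            log_w[i] += alpha(x, y, th_old, h) - alpha(y, x, th_new, h)
            xs[i] = y
    return xs, log_w


def weighted_mean(values, log_w):
    """Self-normalised weighted average, computed stably in log space."""
    m = max(log_w)
    w = [math.exp(a - m) for a in log_w]
    return sum(wi * v for wi, v in zip(w, values)) / sum(w)


if __name__ == "__main__":
    schedule = [k / 100 for k in range(101)]  # theta moves from 0 to 1
    xs, log_w = weighted_ula(schedule)
    print(weighted_mean(xs, log_w))  # estimate of the mean under the final model
```

In this toy setting the weighted average estimates an expectation under the final model even though the walkers lag behind the moving parameter and ULA has a step-size bias. The average of exp(log_w) estimates the ratio of normalizing constants, which equals 1 here because only the mean of the Gaussian changes.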
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Statistical Mechanics and Entropy · Model Reduction and Neural Networks
