Jarzynski Reweighting and Sampling Dynamics for Training Energy-Based Models: Theoretical Analysis of Different Transition Kernels
Davide Carbone

TL;DR
This paper explores the application of Jarzynski reweighting to improve training of Energy-Based Models by analyzing kernel choices and their effects on sampling and bias correction in generative frameworks.
Contribution
It provides a theoretical analysis of Jarzynski reweighting for EBMs, connecting it with flow-based diffusion models and RBMs, and highlights its potential to enhance sampling and reduce biases.
Findings
Jarzynski reweighting can mitigate discretization errors in diffusion models.
Kernel choice significantly impacts model performance and bias correction.
Theoretical insights suggest new directions for principled generative training.
Abstract
Energy-Based Models (EBMs) provide a flexible framework for generative modeling, but their training remains theoretically challenging due to the need to approximate normalization constants and efficiently sample from complex, multi-modal distributions. Traditional methods, such as contrastive divergence and score matching, introduce biases that can hinder accurate learning. In this work, we present a theoretical analysis of Jarzynski reweighting, a technique from non-equilibrium statistical mechanics, and its implications for training EBMs. We focus on the role of the choice of the kernel and we illustrate these theoretical considerations in two key generative frameworks: (i) flow-based diffusion models, where we reinterpret Jarzynski reweighting in the context of stochastic interpolants to mitigate discretization errors and improve sample quality, and (ii) Restricted Boltzmann…
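To make the reweighting idea concrete, here is a minimal sketch on a toy problem, not the paper's experimental setup. It assumes a one-dimensional quadratic energy U_theta(x) = theta * x^2 / 2 whose parameter is ramped from 1 to 4, an unadjusted Langevin (ULA) transition kernel, and a standard sequential importance weight in which the backward kernel is the ULA kernel of the next energy. The function names, the schedule, and all default values are hypothetical choices made for illustration.

```python
import numpy as np


def energy(x, theta):
    # Toy quadratic energy (an assumption for this sketch): U_theta(x) = theta * x^2 / 2
    return 0.5 * theta * x**2


def grad_energy(x, theta):
    return theta * x


def ula_log_density(x_from, x_to, theta, h):
    # Log density of one ULA step x_from -> x_to, up to an additive constant
    # that is the same for forward and backward moves and cancels in the ratio.
    mean = x_from - h * grad_energy(x_from, theta)
    return -((x_to - mean) ** 2) / (4.0 * h)


def jarzynski_ula_demo(n_walkers=20000, n_steps=200, h=0.05,
                       theta_start=1.0, theta_end=4.0, seed=0):
    rng = np.random.default_rng(seed)
    thetas = np.linspace(theta_start, theta_end, n_steps + 1)

    # Walkers start exactly at equilibrium for the initial energy.
    x = rng.normal(0.0, 1.0 / np.sqrt(theta_start), size=n_walkers)
    log_w = np.zeros(n_walkers)

    for k in range(n_steps):
        th, th_next = thetas[k], thetas[k + 1]
        noise = rng.normal(size=n_walkers)
        x_new = x - h * grad_energy(x, th) + np.sqrt(2.0 * h) * noise

        # Log-weight increment: change in energy plus the backward/forward
        # kernel ratio, which accounts for ULA not being exactly invariant.
        log_w += (energy(x, th) - energy(x_new, th_next)
                  + ula_log_density(x_new, x, th_next, h)
                  - ula_log_density(x, x_new, th, h))
        x = x_new

    w = np.exp(log_w)
    z_ratio = w.mean()                            # estimates Z_end / Z_start
    m2_weighted = np.sum(w * x**2) / np.sum(w)    # reweighted E[x^2]
    m2_plain = np.mean(x**2)                      # unweighted ULA estimate
    return z_ratio, m2_weighted, m2_plain


if __name__ == "__main__":
    print(jarzynski_ula_demo())
```

For this toy energy the exact values are known: the ratio of normalization constants is sqrt(theta_start / theta_end) = 0.5 and the target second moment is 1 / theta_end = 0.25, so the weighted estimates should land close to those numbers. The unweighted average carries both the ULA discretization bias and the lag of walkers behind the moving target.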
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Quantum many-body systems
Methods: Focus · Diffusion
