Generalizing Hamiltonian Monte Carlo with Neural Networks
Daniel Levy, Matthew D. Hoffman, Jascha Sohl-Dickstein

TL;DR
This paper introduces a neural network-based extension of Hamiltonian Monte Carlo that significantly improves sampling efficiency and mixing speed across various challenging distributions and real-world tasks.
Contribution
It proposes a novel neural network parameterization for MCMC kernels that generalizes HMC and maximizes mixing speed, with demonstrated empirical improvements.
Findings
Achieved up to a 106x increase in effective sample size on one test distribution.
Enabled mixing where standard HMC fails.
Showed benefits on real-world latent-variable modeling.
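The effective sample size figure above can be made concrete with a small sketch. The code below uses one common autocorrelation-based estimator, ESS = N / (1 + 2 Σ ρ_t), and stops the sum once the autocorrelation falls below a cutoff. The function names and the 0.05 cutoff are assumptions chosen for illustration; this is not taken from the authors' released code.

```python
import numpy as np


def autocorrelation(chain, lag):
    """Sample autocorrelation of a 1-D chain at a positive integer lag."""
    x = chain - chain.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


def effective_sample_size(chain, cutoff=0.05):
    """Estimate ESS = N / (1 + 2 * sum_t rho_t).

    The sum is truncated at the first lag whose autocorrelation drops
    below `cutoff` (an illustrative choice, not a prescribed value).
    """
    n = len(chain)
    total = 0.0
    for lag in range(1, n):
        rho = autocorrelation(chain, lag)
        if rho < cutoff:
            break
        total += rho
    return n / (1.0 + 2.0 * total)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000

    # Independent draws: autocorrelation is near zero, so ESS is close to n.
    iid_chain = rng.standard_normal(n)

    # Strongly correlated AR(1) chain: ESS is far smaller than n.
    ar_chain = np.zeros(n)
    for t in range(1, n):
        ar_chain[t] = 0.9 * ar_chain[t - 1] + rng.standard_normal()

    print("ESS (iid):  ", effective_sample_size(iid_chain))
    print("ESS (AR(1)):", effective_sample_size(ar_chain))
```

A sampler that mixes faster produces a chain with weaker autocorrelation, which is why a higher ESS for the same number of steps is read as an efficiency gain.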
Abstract
We present a general-purpose method to train Markov chain Monte Carlo kernels, parameterized by deep neural networks, that converge and mix quickly to their target distribution. Our method generalizes Hamiltonian Monte Carlo and is trained to maximize expected squared jumped distance, a proxy for mixing speed. We demonstrate large empirical gains on a collection of simple but challenging distributions, for instance achieving a 106x improvement in effective sample size in one case, and mixing when standard HMC makes no measurable progress in a second. Finally, we show quantitative and qualitative gains on a real-world task: latent-variable generative modeling. We release an open source TensorFlow implementation of the algorithm.
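To illustrate the training signal named in the abstract, the sketch below estimates expected squared jumped distance for a plain HMC proposal: the squared distance between the current and proposed positions, weighted by the Metropolis-Hastings acceptance probability and averaged over a batch of chains. It uses standard leapfrog HMC on a unit Gaussian, not the paper's learned kernel, and it does not reproduce the paper's exact training objective. All function names, the target, and the step settings are assumptions for illustration.

```python
import numpy as np


def leapfrog(x, v, grad_u, step_size, n_steps):
    """Standard leapfrog integrator for the Hamiltonian U(x) + |v|^2 / 2."""
    x, v = x.copy(), v.copy()
    v -= 0.5 * step_size * grad_u(x)        # initial half step for momentum
    for i in range(n_steps):
        x += step_size * v                  # full step for position
        if i < n_steps - 1:
            v -= step_size * grad_u(x)      # full step for momentum
    v -= 0.5 * step_size * grad_u(x)        # final half step for momentum
    return x, v


def expected_squared_jumped_distance(x, u, grad_u, step_size, n_steps, rng):
    """Monte Carlo estimate of E[accept_prob * |x' - x|^2] over a batch.

    `x` has shape (batch, dim); `u` maps it to a (batch,) array of energies.
    """
    v = rng.standard_normal(x.shape)
    x_new, v_new = leapfrog(x, v, grad_u, step_size, n_steps)
    h_old = u(x) + 0.5 * np.sum(v ** 2, axis=-1)
    h_new = u(x_new) + 0.5 * np.sum(v_new ** 2, axis=-1)
    # min(1, exp(h_old - h_new)), written so the exponent never exceeds 0.
    accept_prob = np.exp(np.minimum(0.0, h_old - h_new))
    sq_dist = np.sum((x_new - x) ** 2, axis=-1)
    return float(np.mean(accept_prob * sq_dist))


def gaussian_energy(x):
    """Negative log density of a unit Gaussian, up to a constant."""
    return 0.5 * np.sum(x ** 2, axis=-1)


def gaussian_energy_grad(x):
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.standard_normal((256, 2))      # batch of chains started at the target
    for step_size in (0.01, 0.5):
        esjd = expected_squared_jumped_distance(
            x0, gaussian_energy, gaussian_energy_grad, step_size, 10, rng)
        print(f"step_size={step_size}: ESJD estimate = {esjd:.4f}")
```

Because the estimate is a differentiable function of the proposal when written in an autodiff framework, a quantity like this can serve as a training objective for a parameterized kernel; the NumPy version here only evaluates it.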
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference
