Black-box unadjusted Hamiltonian Monte Carlo

Jakob Robnik; Reuben Cohn-Gordon; Uroš Seljak

arXiv:2412.08876·stat.CO·May 20, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces an automatic step size tuning method for unadjusted Hamiltonian Monte Carlo that balances bias and variance, enabling efficient high-dimensional sampling without Metropolis-Hastings adjustments.

Contribution

The authors develop a novel automatic tuning scheme for unadjusted Hamiltonian Monte Carlo based on energy error and bias control, applicable beyond Gaussian distributions.

Findings

01

The tuning scheme effectively bounds asymptotic bias in Gaussian cases.

02

Unadjusted methods with tuning outperform adjusted counterparts in high-dimensional problems.

03

The approach is validated on typical Bayesian inference tasks.

Abstract

Hamiltonian Monte Carlo and underdamped Langevin Monte Carlo are state-of-the-art methods for taking samples from high-dimensional distributions with a differentiable density function. To generate samples, they numerically integrate Hamiltonian or Langevin dynamics. This numerical integration introduces an asymptotic bias in Monte Carlo estimators of expectation values, which can be eliminated by adjusting the dynamics with a Metropolis-Hastings (MH) proposal step. Alternatively, one can trade bias for variance by avoiding MH, and select an integration step size that ensures sufficiently small asymptotic bias, relative to the variance inherent in a finite set of samples. Such unadjusted methods often significantly outperform their adjusted counterparts in high-dimensional problems where sampling would otherwise be prohibitively expensive, yet are rarely used in statistical applications…
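The trade-off described in the abstract can be illustrated with a minimal sketch, assuming a standard Gaussian target and plain NumPy. The function names (`leapfrog`, `unadjusted_hmc`, `run_demo`) and all parameter values are hypothetical and are not taken from the authors' code. The sketch runs HMC with no Metropolis-Hastings step, records the energy error of each trajectory, and reports the variance of that error divided by the dimension. Treating this quantity as the diagnostic that the reviews call EEVPD is an assumption.

```python
import numpy as np


def leapfrog(x, p, grad_U, step_size, n_steps):
    """Integrate Hamiltonian dynamics with the leapfrog scheme (unit mass)."""
    p = p - 0.5 * step_size * grad_U(x)  # initial half kick
    for i in range(n_steps):
        x = x + step_size * p  # full drift
        # Full kick between drifts, half kick after the last drift.
        kick = step_size if i < n_steps - 1 else 0.5 * step_size
        p = p - kick * grad_U(x)
    return x, p


def unadjusted_hmc(U, grad_U, x0, step_size, n_steps, n_samples, rng):
    """HMC without a Metropolis-Hastings step: every trajectory is kept."""
    x = np.array(x0, dtype=float)
    samples = np.empty((n_samples, x.size))
    energy_errors = np.empty(n_samples)
    for i in range(n_samples):
        p = rng.standard_normal(x.size)  # full momentum refresh
        h_start = U(x) + 0.5 * p @ p
        x, p = leapfrog(x, p, grad_U, step_size, n_steps)
        energy_errors[i] = U(x) + 0.5 * p @ p - h_start
        samples[i] = x  # no accept/reject test
    return samples, energy_errors


def run_demo(step_size=0.5, n_steps=3, dim=100, n_samples=2000, burn=100, seed=0):
    """Return (estimated variance of x, energy error variance per dimension)."""
    rng = np.random.default_rng(seed)
    U = lambda x: 0.5 * x @ x  # standard Gaussian target (illustrative choice)
    grad_U = lambda x: x
    samples, energy_errors = unadjusted_hmc(
        U, grad_U, np.zeros(dim), step_size, n_steps, n_samples, rng
    )
    # The target has zero mean, so the mean of x**2 estimates the variance.
    variance = np.mean(samples[burn:] ** 2)
    error_var_per_dim = np.var(energy_errors[burn:]) / dim
    return variance, error_var_per_dim


if __name__ == "__main__":
    for eps, steps in [(0.5, 3), (0.25, 6)]:
        variance, error_var = run_demo(step_size=eps, n_steps=steps)
        # For this target the unadjusted chain is stationary at variance
        # 1 / (1 - eps**2 / 4) instead of the true value 1.
        print(eps, variance, 1.0 / (1.0 - eps**2 / 4.0), error_var)
```

For this particular target, leapfrog exactly conserves a modified Hamiltonian, so the unadjusted chain settles at variance 1 / (1 - ε²/4) rather than 1. Halving the step size shrinks both this bias and the energy error variance per dimension, at the cost of twice as many gradient evaluations per trajectory. This is the kind of relationship between energy error and bias that a step size tuner can exploit.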

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 8 · Confidence 3

Strengths

- The proposed criterion, EEVPD, is intuitive and simple to compute online. It is also backed by theory in the Gaussian setting, and for the more general case the authors show experimentally that a function of EEVPD upper-bounds $b_{\mathrm{cov}}$, suggesting some generality to the result proved in Theorem 4.2.
- The paper contains broad benchmarks, including high-dimensional cases, and the method wins against NUTS and adjusted baselines at fixed error.
- Overall I found the paper to be well-written, wit

Weaknesses

The only weakness of this paper is the scope of the theoretical results. It would have certainly been better to have more general results (at least under log-concavity).

Reviewer 02 · Rating 2 · Confidence 4

Strengths

The paper addresses an important, timely problem relevant to the ICLR audience: selecting the step size in MCMC algorithms. This challenge has broad applications across machine learning and computational statistics. I like the idea of monitoring the expected change of the Hamiltonian.

Weaknesses

### 1. Theoretical Contributions

The theoretical contributions of the paper are simple and of limited interest. The justification for minimizing EEVPD is unconvincing:

- There is no clear evidence that EEVPD can be easily approximated in practice.
- In non-Gaussian settings, there is no reason to assume that EEVPD controls the Wasserstein error.
- Even if a function $\phi$ exists such that an EEVPD smaller than $\varepsilon$ implies a Wasserstein error smaller than $\phi(\varepsilon)$, the

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The proposed method has a strong theoretical foundation for Gaussian distributions. It also provides empirical results showing its effectiveness on non-Gaussian distributions.
2. The numerical results show that the proposed approach achieves near-optimal RMSE with orders-of-magnitude fewer gradient evaluations than adjusted (MH) methods. It also outperforms NUTS and manually tuned adjusted LMC/HMC.
3. Open-source code is provided.

Weaknesses

- The main weakness of this paper is the lack of theoretical justification for non-Gaussian cases. The reviewer understands that the paper still provides a sufficiently novel contribution without it. However, it would be great if the authors could explain when this framework can be expected to work, and when it cannot, for non-Gaussian cases.

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Theoretical and Computational Physics · Statistical Mechanics and Entropy