Revisiting the Effects of Stochasticity for Hamiltonian Samplers

Giulio Franzese; Dimitrios Milios; Maurizio Filippone; Pietro Michiardi

arXiv:2106.16200·cs.LG·November 8, 2021

Open Access

TL;DR

This paper analyzes the impact of stochasticity in Hamiltonian samplers, revealing a convergence bottleneck due to mini-batch gradient noise, supported by theoretical insights and empirical experiments on Bayesian neural networks.

Contribution

It provides a novel analysis of mini-batch effects in Hamiltonian SDEs using differential operator splitting, revising previous results and removing normality assumptions.

Findings

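01. With mini-batches, the best achievable error rate is O(η^2), where η is the integrator step size.

02. Decoupling the stochastic component of the Hamiltonian SDE from the gradient noise clarifies the convergence limits.

03. Experiments on Bayesian neural networks support the theoretical claims.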

Abstract

We revisit the theoretical properties of Hamiltonian stochastic differential equations (SDEs) for Bayesian posterior sampling, and we study the two types of errors that arise from numerical SDE simulation: the discretization error and the error due to noisy gradient estimates in the context of data subsampling. Our main result is a novel analysis for the effect of mini-batches through the lens of differential operator splitting, revising previous literature results. The stochastic component of a Hamiltonian SDE is decoupled from the gradient noise, for which we make no normality assumptions. This leads to the identification of a convergence bottleneck: when considering mini-batches, the best achievable error rate is $O(\eta^{2})$, with $\eta$ being the integrator step size. Our theoretical results are supported by an empirical study on a variety of regression and classification…
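To make the two error sources in the abstract concrete, here is a minimal sketch of a Hamiltonian SDE sampler driven by mini-batch gradients. It is not the paper's integrator or experimental setup. It assumes a toy model (Gaussian likelihood with unit variance, zero-mean Gaussian prior on a scalar mean), unit mass, and a simple Euler-type discretization of the SDE dθ = r dt, dr = -∇U(θ) dt - C r dt + sqrt(2C) dW. The function names, the friction constant `friction`, and all default values are hypothetical choices for illustration.

```python
import numpy as np

def minibatch_grad(theta, data, batch_size, prior_var, rng):
    """Unbiased mini-batch estimate of grad U(theta) for the toy model.

    Assumed model: x_i ~ N(theta, 1), prior theta ~ N(0, prior_var).
    """
    n = data.shape[0]
    batch = rng.choice(data, size=batch_size, replace=False)
    # Rescale the likelihood term by n / batch_size so the estimate is unbiased.
    return theta / prior_var + (n / batch_size) * np.sum(theta - batch)

def sample_hamiltonian_sde(data, eta=0.01, friction=10.0, batch_size=10,
                           n_steps=5000, prior_var=100.0, seed=0):
    """Euler-type simulation of a Hamiltonian SDE with noisy gradients."""
    rng = np.random.default_rng(seed)
    theta, r = 0.0, 0.0
    samples = np.empty(n_steps)
    for k in range(n_steps):
        # Error source 2: the gradient is a noisy mini-batch estimate.
        g = minibatch_grad(theta, data, batch_size, prior_var, rng)
        # Error source 1: the SDE is discretized with step size eta.
        injected = np.sqrt(2.0 * friction * eta) * rng.standard_normal()
        r += -eta * g - eta * friction * r + injected
        theta += eta * r
        samples[k] = theta
    return samples

def exact_posterior_mean(data, prior_var=100.0):
    """Closed-form posterior mean of the toy model, for comparison."""
    n = data.shape[0]
    return np.sum(data) / (n + 1.0 / prior_var)
```

Calling `sample_hamiltonian_sde(data)` on a one-dimensional NumPy array and comparing the samples against `exact_posterior_mean(data)` gives a rough feel for how the step size `eta` and `batch_size` affect the result. The injected noise term and the mini-batch noise enter the momentum update separately, which is the distinction the abstract draws between the stochastic component of the SDE and the gradient noise.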

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks