An adaptive Hessian approximated stochastic gradient MCMC method

Yating Wang; Wei Deng; Guang Lin

arXiv:2010.01384 · math.NA · March 17, 2021

TL;DR

This paper introduces an adaptive Hessian-based preconditioning technique for stochastic gradient MCMC methods, improving sampling efficiency in high-dimensional Bayesian neural network training by incorporating local geometric information.

Contribution

It proposes a novel adaptive Hessian approximation approach that enhances SG-MCMC efficiency and scalability through limited memory, sparsity, and theoretical bias control.

Findings

1. Improved sampling efficiency in high-dimensional models
2. Reduced computational complexity via sparse Hessian approximation
3. Theoretical analysis confirms controllable bias in the method

Abstract

Bayesian approaches have been successfully integrated into the training of deep neural networks. One popular family is stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods, which have attracted increasing interest because they scale to large datasets and help avoid overfitting. Although standard SG-MCMC methods perform well on a variety of problems, they may be inefficient when the random variables in the target posterior density differ in scale or are highly correlated. In this work, we present an adaptive Hessian approximated stochastic gradient MCMC method that incorporates local geometric information while sampling from the posterior. The idea is to apply stochastic approximation to sequentially update a preconditioning matrix at each iteration. The preconditioner possesses second-order information and can guide the random walk of a…
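To make the idea in the abstract concrete, here is a minimal sketch of a preconditioned Langevin sampler whose preconditioner is refreshed at every iteration by a stochastic approximation step. It is not the authors' algorithm: the Hessian estimate is a diagonal Hutchinson-style probe swapped in for illustration, the correction term that an iteration-dependent preconditioner formally requires is omitted, and the function name `adaptive_preconditioned_sgld`, the toy quadratic target, the step size, and the decay exponent 0.6 are all assumptions made for this example.

```python
import numpy as np


def adaptive_preconditioned_sgld(A, n_steps=50_000, eps=0.05, grad_noise=0.1,
                                 fd_delta=1e-3, seed=0):
    """Sample from N(0, A^{-1}) with a diagonally preconditioned Langevin step.

    Illustrative sketch only: the target is a toy quadratic energy
    U(theta) = 0.5 * theta^T A theta, and the preconditioner is the inverse of
    a running estimate of diag(Hessian).
    """
    rng = np.random.default_rng(seed)
    dim = A.shape[0]
    theta = np.zeros(dim)
    h = np.ones(dim)  # running estimate of the Hessian diagonal
    samples = np.empty((n_steps, dim))

    def stoch_grad(x, noise):
        # Gradient of U plus additive noise standing in for minibatch error.
        return A @ x + noise

    for k in range(n_steps):
        noise = grad_noise * rng.standard_normal(dim)
        g = stoch_grad(theta, noise)

        # Hutchinson-style probe: z * (H z) has expectation diag(H).
        # H z is a finite difference of gradients that reuse the same noise,
        # mimicking two gradient calls on one minibatch.
        z = rng.choice([-1.0, 1.0], size=dim)
        hz = (stoch_grad(theta + fd_delta * z, noise) - g) / fd_delta
        diag_est = z * hz

        # Stochastic approximation update with a decaying step size.
        omega = 1.0 / (k + 1) ** 0.6
        h = h + omega * (diag_est - h)

        # Preconditioned Langevin step; the floor keeps the scaling positive.
        precond = 1.0 / np.maximum(h, 1e-6)
        theta = (theta - eps * precond * g
                 + np.sqrt(2.0 * eps * precond) * rng.standard_normal(dim))
        samples[k] = theta

    return samples, h


if __name__ == "__main__":
    # Badly scaled, mildly correlated toy target (hypothetical values).
    A = np.array([[0.02, 0.01], [0.01, 1.0]])
    samples, h = adaptive_preconditioned_sgld(A)
    print("estimated Hessian diagonal:", h)
    print("empirical variances:", samples[2000:].var(axis=0))
    print("target variances:   ", np.diag(np.linalg.inv(A)))
```

With the preconditioner set to the inverse of the estimated Hessian diagonal, both coordinates move at a comparable rate even though their scales differ by a factor of about fifty, which is the scale-mismatch situation the abstract describes. A diagonal estimate cannot capture correlations between variables, so this sketch addresses only the scaling half of the problem.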

Taxonomy

Methods: Pruning