An adaptive Hessian approximated stochastic gradient MCMC method
Yating Wang, Wei Deng, Guang Lin

TL;DR
This paper introduces an adaptive Hessian-based preconditioning technique for stochastic gradient MCMC methods, improving sampling efficiency in high-dimensional Bayesian neural network training by incorporating local geometric information.
Contribution
It proposes an adaptive Hessian approximation that improves the efficiency and scalability of SG-MCMC by using a limited-memory, sparse preconditioner whose bias is shown theoretically to be controllable.
Findings
Improved sampling efficiency in high-dimensional models
Reduced computational complexity via sparse Hessian approximation
Theoretical analysis confirms controllable bias in the method
Abstract
Bayesian approaches have been successfully integrated into the training of deep neural networks. One popular family is stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods, which have gained increasing interest because they scale to large datasets and help avoid overfitting. Although standard SG-MCMC methods perform well on a variety of problems, they may be inefficient when the random variables in the target posterior density differ in scale or are highly correlated. In this work, we present an adaptive Hessian approximated stochastic gradient MCMC method that incorporates local geometric information while sampling from the posterior. The idea is to apply stochastic approximation to sequentially update a preconditioning matrix at each iteration. The preconditioner possesses second-order information and can guide the random walk of a…
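The preconditioning idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it swaps their limited-memory, sparse Hessian approximation for a much simpler diagonal secant estimate, uses a toy Gaussian target whose two coordinates differ in scale, and includes no correction term for the changing preconditioner. The function names, the step size, the weight schedule `(k + 2) ** -0.6`, the clipping bounds, and `h_init` are all assumptions chosen for illustration.

```python
import numpy as np


def noisy_grad(theta, precision, rng, noise_std):
    # Gradient of U(theta) = 0.5 * sum(precision * theta**2), plus Gaussian
    # noise standing in for minibatch gradient error (an assumption).
    return precision * theta + noise_std * rng.standard_normal(theta.shape)


def adaptive_preconditioned_sgld(precision, n_steps=40000, step=0.05,
                                 noise_std=0.05, h_init=100.0, seed=0):
    """Langevin sampler with a diagonal curvature estimate as preconditioner.

    Illustrative only: the curvature is the ratio of two running averages,
    E[s * y] / E[s * s], with s the parameter change and y the gradient change.
    """
    rng = np.random.default_rng(seed)
    dim = precision.shape[0]
    theta = np.zeros(dim)
    ss = np.full(dim, 1e-3)   # running average of s * s
    sy = h_init * ss          # running average of s * y (conservative start)
    grad = noisy_grad(theta, precision, rng, noise_std)
    samples = np.empty((n_steps, dim))

    for k in range(n_steps):
        # Current diagonal curvature estimate, clipped to keep steps bounded.
        h_diag = np.clip(sy / ss, 0.05, 1e3)
        precond = 1.0 / h_diag

        # Preconditioned Langevin step: drift and injected noise share precond.
        xi = rng.standard_normal(dim)
        theta_new = (theta - step * precond * grad
                     + np.sqrt(2.0 * step * precond) * xi)
        grad_new = noisy_grad(theta_new, precision, rng, noise_std)

        # Stochastic-approximation update of the two secant moments
        # with a decaying weight.
        s = theta_new - theta
        y = grad_new - grad
        w = (k + 2) ** -0.6
        sy += w * (s * y - sy)
        ss += w * (s * s - ss)

        theta, grad = theta_new, grad_new
        samples[k] = theta

    return samples, np.clip(sy / ss, 0.05, 1e3)


if __name__ == "__main__":
    precision = np.array([1.0, 100.0])  # coordinates differ in scale by 10x
    samples, h_diag = adaptive_preconditioned_sgld(precision)
    print("estimated curvature:", h_diag)
    print("sample variance    :", samples[10000:].var(axis=0))
    print("target variance    :", 1.0 / precision)
```

With the preconditioner in place, both coordinates mix at a similar rate even though their scales differ, which is the inefficiency the abstract attributes to standard SG-MCMC. A full-matrix or limited-memory approximation would additionally be needed to handle correlated coordinates.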
Taxonomy
Methods: Pruning
