Approximating Hessian matrices using Bayesian inference: a new approach for quasi-Newton methods in stochastic optimization
Andre Carlon, Luis Espath, Raul Tempone

TL;DR
This paper introduces a Bayesian method to approximate Hessian matrices in stochastic optimization, improving convergence of stochastic gradient descent without amplifying noise, especially for ill-conditioned problems.
Contribution
A Bayesian approach to Hessian approximation that minimizes the residue of the secant equations while keeping the extreme eigenvalues within a specified range, which improves the convergence of stochastic optimization.
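For readers new to the terminology, the two ingredients of this contribution can be written in generic quasi-Newton notation. The symbols $s_k$, $y_k$, $g$, $B$, $\mu$, and $L$ are notation assumed here for illustration. The paper maximizes a log posterior, so its exact objective may weight or regularize these terms differently.

```latex
% Step and (noisy) gradient difference between consecutive iterates
s_k = x_{k+1} - x_k, \qquad y_k = g(x_{k+1}) - g(x_k)

% Secant equation that a Hessian approximation B should satisfy
B s_k = y_k

% Generic constrained fit: small secant residue, bounded extreme eigenvalues
\min_{B = B^\top} \; \sum_k \| B s_k - y_k \|^2
\quad \text{subject to} \quad \mu I \preceq B \preceq L I
```

The lower bound $\mu$ limits how much the inverse $B^{-1}$ can scale a noisy gradient, which is one way to read the claim that the method avoids amplifying noise.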
Findings
Improves convergence of stochastic gradient descent.
Pre-conditioning the stochastic gradient with the inverse of the Hessian approximation pays off most on ill-conditioned problems.
Improves on stochastic gradient descent on a stochastic quadratic function and a regularized logistic regression problem.
Abstract
Using quasi-Newton methods in stochastic optimization is not a trivial task, given the difficulty of extracting curvature information from noisy gradients. Moreover, pre-conditioning noisy gradient observations tends to amplify the noise. We propose a Bayesian approach to obtain a Hessian matrix approximation for stochastic optimization that minimizes the residue of the secant equations while keeping the extreme eigenvalues within a specified range. Thus, the proposed approach helps stochastic gradient descent converge to local minima without amplifying gradient noise. We propose maximizing the log posterior using the Newton-CG method. Numerical results on a stochastic quadratic function and a regularized logistic regression problem are presented. In all the cases tested, our approach improves the convergence of stochastic gradient descent, compensating for the overhead of…
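The idea in the abstract can be illustrated with a minimal sketch. It is a simplified stand-in, not the authors' method: it replaces the Bayesian log-posterior maximization (solved with Newton-CG in the paper) with an ordinary least-squares fit of the secant equations followed by eigenvalue clipping. All function names, the test matrix `A`, the noise level, and the step sizes are hypothetical choices made for illustration.

```python
import numpy as np


def fit_hessian_from_secants(S, Y, eig_min, eig_max):
    """Least-squares secant fit followed by eigenvalue clipping.

    S, Y have shape (m, d): row k holds a step s_k and the matching noisy
    gradient difference y_k. ASSUMPTION: this is a simplified stand-in for
    the paper's Bayesian posterior maximization, not the authors' algorithm.
    """
    # Solve min_B sum_k ||B s_k - y_k||^2, written row-wise as S @ B.T ~ Y.
    Bt, *_ = np.linalg.lstsq(S, Y, rcond=None)
    B = 0.5 * (Bt + Bt.T)  # enforce symmetry
    # Keep the extreme eigenvalues inside [eig_min, eig_max].
    w, V = np.linalg.eigh(B)
    w = np.clip(w, eig_min, eig_max)
    return (V * w) @ V.T


def noisy_grad(A, x, sigma, rng):
    """Gradient of 0.5 * x^T A x with additive Gaussian noise (toy model)."""
    return A @ x + sigma * rng.standard_normal(x.shape)


def collect_secant_pairs(A, m, sigma, rng):
    """Sample m random steps and the matching noisy gradient differences."""
    d = A.shape[0]
    S = rng.standard_normal((m, d))
    X0 = rng.standard_normal((m, d))
    Y = np.stack([
        noisy_grad(A, x0 + s, sigma, rng) - noisy_grad(A, x0, sigma, rng)
        for x0, s in zip(X0, S)
    ])
    return S, Y


def sgd(A, x0, step, n_steps, sigma, rng, B=None):
    """Plain SGD, or SGD pre-conditioned with B^{-1} when B is given."""
    x = x0.copy()
    for _ in range(n_steps):
        g = noisy_grad(A, x, sigma, rng)
        if B is not None:
            g = np.linalg.solve(B, g)  # apply the inverse Hessian estimate
        x = x - step * g
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = np.diag([1.0, 100.0])  # ill-conditioned toy quadratic
    S, Y = collect_secant_pairs(A, m=50, sigma=0.1, rng=rng)
    B = fit_hessian_from_secants(S, Y, eig_min=0.5, eig_max=200.0)
    x0 = np.array([10.0, 10.0])
    print("plain SGD:          ", sgd(A, x0, 0.01, 200, 0.1, rng))
    print("pre-conditioned SGD:", sgd(A, x0, 0.5, 200, 0.1, rng, B=B))
```

On this toy problem the plain step size is limited by the largest eigenvalue, so progress along the low-curvature direction is slow, whereas the pre-conditioned iteration can take a much larger step. The lower eigenvalue bound matters because the inverse of a small eigenvalue multiplies the gradient noise in that direction.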
Taxonomy
Topics: Statistical and numerical algorithms · Advanced Optimization Algorithms Research · Gaussian Processes and Bayesian Inference
