Approximating Hessian matrices using Bayesian inference: a new approach for quasi-Newton methods in stochastic optimization

Andre Carlon; Luis Espath; Raul Tempone

arXiv:2208.00441 · math.OC · April 2, 2024 · Optim. Methods Softw.

Open Access · 1 Repo

TL;DR

This paper introduces a Bayesian method to approximate Hessian matrices in stochastic optimization, improving convergence of stochastic gradient descent without amplifying noise, especially for ill-conditioned problems.

Contribution

A Bayesian approach to Hessian approximation that minimizes the residue of the secant equations while keeping the extreme eigenvalues within a specified range, which improves the convergence of stochastic optimization.
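
For orientation, the two ingredients named above can be written in standard quasi-Newton notation. This page does not give the paper's formulas, so the symbols $s_k$, $y_k$, $B$, $\mu$, and $L$ below are assumptions chosen for illustration, and the paper's actual prior and likelihood may encode these conditions differently. With iterates $x_k$ and objective $f$, a secant equation asks the approximation $B$ to map a step to the matching change in gradient:

$$
s_k = x_{k+1} - x_k, \qquad y_k = \nabla f(x_{k+1}) - \nabla f(x_k), \qquad B\, s_k \approx y_k .
$$

A constrained least-squares reading of "minimal secant residue with bounded extreme eigenvalues" would then be:

$$
\min_{B = B^\top} \; \sum_k \lVert B\, s_k - y_k \rVert^2 \quad \text{subject to} \quad \mu I \preceq B \preceq L I .
$$

The lower bound $\mu$ is what limits how much the inverse $B^{-1}$ can amplify gradient noise, since $\lVert B^{-1} \rVert \le 1/\mu$.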

Findings

01. Improves convergence of stochastic gradient descent.

02. Pre-conditioning with the inverse Hessian is more effective for ill-conditioned problems.

03. Outperforms traditional methods on quadratic and logistic regression tasks.
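
The effect of pre-conditioning on an ill-conditioned problem can be illustrated with a small sketch. It is not taken from the paper or from the `agcarlon/bayhess` repository: the diagonal quadratic, the noise level, the step sizes, and the helper names `noisy_grad` and `run` are all assumptions, and the pre-conditioner here is the exact inverse Hessian rather than an approximation.

```python
import random


def noisy_grad(x, diag, sigma, rng):
    # Gradient of f(x) = 0.5 * sum(d_i * x_i**2) with additive Gaussian noise.
    return [d * xi + rng.gauss(0.0, sigma) for d, xi in zip(diag, x)]


def run(diag, precond, step, iters=200, sigma=0.01, seed=0):
    # Stochastic gradient descent with a fixed diagonal pre-conditioner.
    rng = random.Random(seed)
    x = [1.0 for _ in diag]
    for _ in range(iters):
        g = noisy_grad(x, diag, sigma, rng)
        x = [xi - step * p * gi for xi, p, gi in zip(x, precond, g)]
    # Return the noise-free objective value at the final iterate.
    return 0.5 * sum(d * xi * xi for d, xi in zip(diag, x))


diag = [1.0, 1000.0]  # Hessian eigenvalues; condition number 1000 (assumed)

# Plain SGD: the step is limited by the largest curvature,
# so progress along the flat direction is slow.
f_sgd = run(diag, precond=[1.0, 1.0], step=1.0 / max(diag))

# Pre-conditioned SGD: each coordinate is scaled by its inverse curvature,
# so one moderate step size suits both directions.
f_pre = run(diag, precond=[1.0 / d for d in diag], step=0.5)

print(f"plain SGD objective:           {f_sgd:.3e}")
print(f"pre-conditioned SGD objective: {f_pre:.3e}")
```

Scaling by the inverse curvature also scales the gradient noise along the flat direction, which is the amplification the abstract warns about; the sketch keeps the noise small so that the conditioning effect dominates.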

Abstract

Using quasi-Newton methods in stochastic optimization is not a trivial task given the difficulty of extracting curvature information from the noisy gradients. Moreover, pre-conditioning noisy gradient observations tends to amplify the noise. We propose a Bayesian approach to obtain a Hessian matrix approximation for stochastic optimization that minimizes the secant equations residue while retaining the extreme eigenvalues within a specified range. Thus, the proposed approach assists stochastic gradient descent to converge to local minima without augmenting gradient noise. We propose maximizing the log posterior using the Newton-CG method. Numerical results on a stochastic quadratic function and an $\ell_{2}$-regularized logistic regression problem are presented. In all the cases tested, our approach improves the convergence of stochastic gradient descent, compensating for the overhead of…
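
A much-simplified sketch of fitting curvature to noisy secant pairs under eigenvalue bounds is given below. It swaps in a different technique from the one the abstract describes: instead of maximizing a log posterior with Newton-CG, it assumes a diagonal Hessian approximation (so the eigenvalues are the diagonal entries), solves the per-coordinate least-squares problem in closed form, and clips the result to a range. The function name `fit_diag_hessian`, the bounds, and the synthetic data are all hypothetical.

```python
import random


def fit_diag_hessian(pairs, lo, hi):
    """Diagonal least-squares fit to secant pairs (s, y), clipped to [lo, hi].

    For each coordinate i this minimizes sum_k (h_i * s_k[i] - y_k[i])**2,
    whose unconstrained solution is sum(s*y) / sum(s*s); clipping that value
    gives the minimizer over the interval [lo, hi].
    """
    dim = len(pairs[0][0])
    estimate = []
    for i in range(dim):
        num = sum(s[i] * y[i] for s, y in pairs)
        den = sum(s[i] * s[i] for s, y in pairs)
        h = num / den if den > 0.0 else lo
        estimate.append(min(max(h, lo), hi))
    return estimate


# Synthetic secant pairs: y = H s + noise, with an assumed diagonal H.
rng = random.Random(1)
true_diag = [1.0, 50.0]
pairs = []
for _ in range(20):
    s = [rng.gauss(0.0, 0.1) for _ in true_diag]
    y = [d * si + rng.gauss(0.0, 0.5) for d, si in zip(true_diag, s)]
    pairs.append((s, y))

approx = fit_diag_hessian(pairs, lo=0.5, hi=100.0)
print(approx)
```

With short steps and noisy gradient differences, the unclipped estimate of a small curvature can come out near zero or negative, and its inverse would then blow up the gradient noise; the lower bound is what prevents this in the sketch.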

Code & Models

Repositories

agcarlon/bayhess (Official)

Taxonomy

Topics: Statistical and numerical algorithms · Advanced Optimization Algorithms Research · Gaussian Processes and Bayesian Inference