Discriminative Bayesian filtering lends momentum to the stochastic Newton method for minimizing log-convex functions

Michael C. Burkhart

arXiv:2104.12949 · stat.ML · August 22, 2023

TL;DR

This paper recasts stochastic Newton optimization of an average of log-convex functions as Bayesian filtering on a latent state-space model. The resulting update draws on the entire history of subsampled gradients and Hessians, and the paper gives matrix-based conditions under which older observations lose influence over time.

Contribution

The paper presents an optimization algorithm, derived from discriminative Bayesian filtering, that uses the entire history of gradients and Hessians when forming each stochastic Newton update.
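For intuition about how an update can depend on every past gradient while weighting older ones less, the classical heavy ball recursion of Polyak is a useful reference point. The block below is a standard textbook identity, not the paper's filtering update; the step size $\alpha$, momentum coefficient $\beta$, and the initialization $x_{-1} = x_0$ are assumptions chosen for illustration.

```latex
% Heavy ball update with step size \alpha > 0 and momentum 0 \le \beta < 1:
x_{k+1} = x_k - \alpha \nabla f(x_k) + \beta \, (x_k - x_{k-1})

% Unrolling the recursion with x_{-1} = x_0 gives
x_{k+1} - x_k = -\alpha \sum_{j=0}^{k} \beta^{\,k-j} \, \nabla f(x_j)
```

Each step is a weighted sum over all past gradients, and the weight on the gradient from iteration $j$ decays geometrically as $\beta^{k-j}$.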

Findings

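01

Matrix-based conditions under which the effect of older gradient and Hessian observations diminishes over time

02

A filtering-based update whose treatment of past observations is analogous to Polyak's heavy ball momentum, illustrated with an example

03

A review of other relevant innovations for the stochastic Newton method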

Abstract

To minimize the average of a set of log-convex functions, the stochastic Newton method iteratively updates its estimate using subsampled versions of the full objective's gradient and Hessian. We contextualize this optimization problem as sequential Bayesian inference on a latent state-space model with a discriminatively-specified observation process. Applying Bayesian filtering then yields a novel optimization algorithm that considers the entire history of gradients and Hessians when forming an update. We establish matrix-based conditions under which the effect of older observations diminishes over time, in a manner analogous to Polyak's heavy ball momentum. We illustrate various aspects of our approach with an example and review other relevant innovations for the stochastic Newton method.
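The baseline that the abstract starts from, the plain stochastic Newton method, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's filtered algorithm and not code from the linked repository. The objective family f_i(x) = exp(0.5 * ||x - c_i||^2), which is log-convex because its logarithm is a convex quadratic, and the names `objective`, `stochastic_newton`, `centers`, `batch_size`, and `step` are all hypothetical choices made for this example.

```python
import numpy as np


def objective(centers, x):
    """Average of the assumed log-convex terms f_i(x) = exp(0.5 * ||x - c_i||^2)."""
    r = x - centers
    return float(np.mean(np.exp(0.5 * np.sum(r**2, axis=1))))


def stochastic_newton(centers, x0, batch_size=8, n_steps=60, step=1.0, seed=0):
    """Plain stochastic Newton: each update uses only the current minibatch.

    Illustrative baseline only; it keeps no memory of earlier gradients
    or Hessians, unlike the filtering approach described in the abstract.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    n, d = centers.shape
    for _ in range(n_steps):
        # Subsample the terms of the objective without replacement.
        idx = rng.choice(n, size=batch_size, replace=False)
        r = x - centers[idx]                       # residuals, shape (m, d)
        f = np.exp(0.5 * np.sum(r**2, axis=1))     # term values, shape (m,)
        # Gradient of f_i is f_i * r_i; Hessian is f_i * (I + r_i r_i^T).
        grad = (f[:, None] * r).mean(axis=0)
        outer = r[:, :, None] * r[:, None, :]
        hess = (f[:, None, None] * (np.eye(d) + outer)).mean(axis=0)
        # Newton step on the subsampled objective.
        x = x - step * np.linalg.solve(hess, grad)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    centers = 0.5 * rng.standard_normal((40, 2))
    x0 = np.array([2.0, -2.0])
    x = stochastic_newton(centers, x0)
    print("objective at start:", objective(centers, x0))
    print("objective at end:  ", objective(centers, x))
```

Because every step here depends only on the current subsample, the iterates keep fluctuating with the minibatch noise. The paper's contribution, per the abstract, is to replace this memoryless step with a Bayesian filtering update that also accounts for earlier gradients and Hessians.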

Code & Models

Repositories

burkh4rt/filtered-stochastic-newton (official)
