Trust-Region Algorithms for Training Responses: Machine Learning Methods Using Indefinite Hessian Approximations

Jennifer B. Erway, Joshua Griffin, Roummel F. Marcia, and Riadh Omheni

arXiv:1807.00251·math.NA·May 24, 2019·1 cites

Open Access

TL;DR

This paper introduces a trust-region quasi-Newton method that can handle indefinite Hessian approximations when training machine learning models, and reports better results than L-BFGS and Hessian-free methods within a fixed computational budget.

Contribution

It proposes a novel trust-region algorithm that accommodates indefinite Hessians, enhancing large-scale ML optimization beyond existing quasi-Newton and Hessian-free approaches.
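
For context, the generic trust-region subproblem can be written as below. This is the textbook form, not necessarily the exact formulation used in the paper; the symbols g_k (gradient), B_k (quasi-Newton Hessian approximation), and \Delta_k (trust-region radius) are notation assumed here for illustration.

```latex
\min_{s \in \mathbb{R}^n} \; m_k(s) = g_k^{T} s + \tfrac{1}{2}\, s^{T} B_k s
\quad \text{subject to} \quad \|s\|_2 \le \Delta_k
```

Because the step s is confined to a ball, the quadratic model attains a minimum even when B_k is indefinite, which is why a trust-region framework does not need the positive-definite approximation that line-search BFGS methods rely on.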

Findings

01. Outperforms L-BFGS and Hessian-free methods in experiments

02. Achieves better training results within fixed computational time

03. Handles indefinite Hessian approximations effectively

Abstract

Machine learning (ML) problems are often posed as highly nonlinear and nonconvex unconstrained optimization problems. Methods for solving ML problems based on stochastic gradient descent are easily scaled for very large problems but may involve fine-tuning many hyper-parameters. Quasi-Newton approaches based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) update typically do not require manually tuning hyper-parameters but suffer from approximating a potentially indefinite Hessian with a positive-definite matrix. Hessian-free methods leverage the ability to perform Hessian-vector multiplication without needing the entire Hessian matrix, but each iteration's complexity is significantly greater than that of quasi-Newton methods. In this paper we propose an alternative approach for solving ML problems based on a quasi-Newton trust-region framework for solving large-scale optimization…
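
The abstract's point about Hessian-vector multiplication without the full Hessian can be illustrated with a minimal sketch. It is not the paper's implementation: the toy function, the names `grad`, `hessian_vector_product`, and `curvature`, and the central-difference scheme are all assumptions chosen for illustration. It also shows a direction of negative curvature, which a positive-definite BFGS approximation cannot represent.

```python
def grad(x):
    """Gradient of the toy nonconvex function f(x) = x0^2 - x1^2 + 0.25 * x1^4.

    The function is an illustrative assumption, not taken from the paper.
    """
    return [2.0 * x[0], -2.0 * x[1] + x[1] ** 3]


def hessian_vector_product(grad_fn, x, v, eps=1e-5):
    """Approximate H(x) v with a central difference of gradients.

    Only two gradient evaluations are needed; the Hessian is never formed.
    """
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus = grad_fn(x_plus)
    g_minus = grad_fn(x_minus)
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


def curvature(grad_fn, x, v):
    """Return v^T H(x) v, the curvature of f along direction v."""
    hv = hessian_vector_product(grad_fn, x, v)
    return sum(vi * hi for vi, hi in zip(v, hv))


if __name__ == "__main__":
    point = [1.0, 0.5]
    # Positive curvature along the first axis, negative along the second,
    # so the Hessian at this point is indefinite.
    print(curvature(grad, point, [1.0, 0.0]))
    print(curvature(grad, point, [0.0, 1.0]))
```

At the point used above the exact Hessian is diagonal with entries 2 and -1.25, so the two printed curvatures should have opposite signs.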

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research