Nonlinear Least Squares for Large-Scale Machine Learning using Stochastic Jacobian Estimates

Johannes J. Brust

arXiv:2107.05598·cs.LG·July 13, 2021

TL;DR

This paper introduces new algorithms leveraging the low-rank Hessian structure in large-scale nonlinear least squares problems, using stochastic Jacobian estimates to improve search directions in machine learning.

Contribution

The paper proposes two algorithms that estimate Jacobian matrices and exploit the low-rank Hessian structure that arises in large-scale nonlinear least squares problems when the number of model parameters exceeds the batch size.

Findings

01 The two algorithms perform well when compared to state-of-the-art methods in experiments.

02 Effective Jacobian estimation reduces computational cost.

03 Convergence improves on large-scale machine learning tasks.

Abstract

For large nonlinear least squares loss functions in machine learning we exploit the property that the number of model parameters typically exceeds the data in one batch. This implies a low-rank structure in the Hessian of the loss, which enables effective means to compute search directions. Using this property, we develop two algorithms that estimate Jacobian matrices and perform well when compared to state-of-the-art methods.
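The low-rank property mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the paper's algorithm: it assumes a Gauss-Newton approximation of the Hessian, J^T J, for a batch Jacobian J of shape (m, n) with m much smaller than n, and it adds a damping term lam that does not appear in the abstract. The function name damped_gn_step, the sizes m and n, and the random data are all hypothetical. Under these assumptions, J^T J has rank at most m, and a damped search direction can be obtained from an m-by-m linear system instead of an n-by-n one.

```python
import numpy as np


def damped_gn_step(J, r, lam=1e-3):
    """Return s solving (J^T J + lam*I) s = -J^T r via an m-by-m system.

    Hypothetical helper for illustration; lam is an assumed damping parameter.
    """
    m = J.shape[0]
    # Push-through identity:
    # (J^T J + lam*I)^{-1} J^T = J^T (J J^T + lam*I)^{-1}
    small = J @ J.T + lam * np.eye(m)
    return -J.T @ np.linalg.solve(small, r)


rng = np.random.default_rng(0)
m, n = 8, 200                      # batch size m, number of parameters n (assumed)
J = rng.standard_normal((m, n))    # stand-in for a batch Jacobian
r = rng.standard_normal(m)         # stand-in for the batch residuals

H = J.T @ J                        # n-by-n Gauss-Newton matrix
rank_H = np.linalg.matrix_rank(H)  # bounded above by m

s_small = damped_gn_step(J, r)     # direction from the m-by-m system
s_full = -np.linalg.solve(H + 1e-3 * np.eye(n), J.T @ r)  # direction from the n-by-n system
```

Both directions agree up to rounding error, but the first only factors an 8-by-8 matrix, which is the kind of saving the low-rank structure makes possible when n is large.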

Code & Models

Repositories

johannesbrust/SNLLS · tf · Official

Taxonomy

Topics: Neural Networks and Applications · Sparse and Compressive Sensing Techniques · Control Systems and Identification