Nonlinear Least Squares for Large-Scale Machine Learning using Stochastic Jacobian Estimates
Johannes J. Brust

TL;DR
This paper introduces algorithms for large-scale nonlinear least squares problems in machine learning that exploit the low-rank structure of the Hessian and use stochastic Jacobian estimates to compute search directions.
Contribution
The paper proposes two algorithms that estimate Jacobian matrices and exploit the low-rank Hessian structure of large-scale nonlinear least squares problems.
Findings
The proposed algorithms perform well when compared to state-of-the-art methods in experiments.
Effective Jacobian estimation reduces the computational cost.
The methods improve convergence on large-scale machine learning tasks.
Abstract
For large nonlinear least squares loss functions in machine learning we exploit the property that the number of model parameters typically exceeds the data in one batch. This implies a low-rank structure in the Hessian of the loss, which enables effective means to compute search directions. Using this property, we develop two algorithms that estimate Jacobian matrices and perform well when compared to state-of-the-art methods.
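The low-rank property mentioned in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes an exact Jacobian J (m residuals in the batch, n parameters, m < n), the Gauss-Newton approximation J^T J of the Hessian (rank at most m), and a damping parameter lam. Under those assumptions, the identity (J^T J + lam I)^{-1} J^T = J^T (J J^T + lam I)^{-1} lets the search direction be computed from an m x m system instead of an n x n one. All function names below are hypothetical and chosen for illustration only.

```python
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def step_small_system(J, r, lam):
    """Damped Gauss-Newton step s = -J^T (J J^T + lam I)^{-1} r (m x m solve)."""
    Jt = transpose(J)
    K = matmul(J, Jt)  # m x m, cheap when the batch is small
    for i in range(len(K)):
        K[i][i] += lam
    y = solve(K, r)
    return [-v for v in matvec(Jt, y)]


def step_full_system(J, r, lam):
    """Same step from the n x n system (J^T J + lam I) s = -J^T r, for comparison."""
    Jt = transpose(J)
    H = matmul(Jt, J)  # n x n, rank at most m before damping
    for i in range(len(H)):
        H[i][i] += lam
    g = matvec(Jt, r)
    return solve(H, [-gi for gi in g])


# Assumed toy sizes: 3 residuals in the batch, 12 model parameters.
random.seed(0)
m, n = 3, 12
J = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
r = [random.gauss(0, 1) for _ in range(m)]
lam = 1e-2

s_small = step_small_system(J, r, lam)
s_full = step_full_system(J, r, lam)
max_diff = max(abs(a - b) for a, b in zip(s_small, s_full))
```

Both routines return the same direction up to rounding, but the first only factors an m x m matrix, which is where the saving comes from when the parameter count far exceeds the batch size. The paper's methods work with stochastic estimates of J rather than the exact Jacobian used here.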
Taxonomy
Topics: Neural Networks and Applications · Sparse and Compressive Sensing Techniques · Control Systems and Identification
