A Novel Fast Exact Subproblem Solver for Stochastic Quasi-Newton Cubic Regularized Optimization
Jarad Forristal, Joshua Griffin, Wenwen Zhou, Seyedalireza Yektamaram

TL;DR
This paper introduces a fast, exact solver for cubic regularized subproblems in stochastic quasi-Newton optimization, enabling scalable second-order methods with competitive performance on deep neural networks.
Contribution
It presents a novel matrix-free, exact subproblem solver for LQN-based ARC methods, improving speed and scalability in large-scale nonconvex optimization.
Findings
Substantial speed-ups over traditional methods.
Competitive performance with Adam on DNNs.
Minimal tuning required for the proposed optimizer.
Abstract
In this work we describe an Adaptive Regularization using Cubics (ARC) method for large-scale nonconvex unconstrained optimization using Limited-memory Quasi-Newton (LQN) matrices. ARC methods are a relatively new family of optimization strategies that utilize a cubic-regularization (CR) term in place of trust-regions and line-searches. LQN methods offer a large-scale alternative to using explicit second-order information by taking identical inputs to those used by popular first-order methods such as stochastic gradient descent (SGD). Solving the CR subproblem exactly requires Newton's method, yet using properties of the internal structure of LQN matrices, we are able to find exact solutions to the CR subproblem in a matrix-free manner, providing large speedups and scaling into modern size requirements. Additionally, we expand upon previous ARC work and explicitly incorporate…
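To make the cubic-regularized (CR) subproblem concrete, the sketch below minimizes the standard cubic model m(s) = g^T s + (1/2) s^T B s + (sigma/3) ||s||^3 for a small dense symmetric matrix B. It is an illustration only, not the paper's matrix-free LQN solver: it assumes B is small enough for a full eigendecomposition, assumes g is nonzero with a component along the eigenvector of the smallest eigenvalue (the so-called easy case), and substitutes plain bisection for Newton's method as the one-dimensional root-finder. The function name, arguments, and tolerances are hypothetical.

```python
import numpy as np


def solve_cubic_subproblem(B, g, sigma, tol=1e-12, max_iter=200):
    """Minimize g^T s + 0.5 s^T B s + (sigma/3) ||s||^3 for dense symmetric B.

    Illustrative sketch: assumes the "easy case" and uses a full
    eigendecomposition, which is NOT matrix-free.
    """
    eigvals, Q = np.linalg.eigh(B)   # B = Q diag(eigvals) Q^T, eigvals ascending
    g_hat = Q.T @ g                  # gradient expressed in the eigenbasis

    def phi(lam):
        # Global optimality: (B + lam I) s = -g with lam = sigma * ||s||.
        # phi(lam) = ||s(lam)|| - lam / sigma is strictly decreasing in lam.
        return np.linalg.norm(g_hat / (eigvals + lam)) - lam / sigma

    # lam must keep B + lam I positive definite.
    lo = max(0.0, -eigvals[0]) + 1e-12
    hi = lo + 1.0
    while phi(hi) > 0:               # grow the bracket until the sign changes
        hi *= 2.0

    for _ in range(max_iter):        # bisection on the scalar equation
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break

    lam = 0.5 * (lo + hi)
    s = -Q @ (g_hat / (eigvals + lam))
    return s, lam


if __name__ == "__main__":
    B = np.array([[1.0, 0.0], [0.0, -2.0]])   # indefinite, so nonconvex model
    g = np.array([1.0, 1.0])
    s, lam = solve_cubic_subproblem(B, g, sigma=1.0)
    print("step:", s, "shift:", lam)
```

The point of the sketch is that once the spectrum of B is available, the whole subproblem collapses to a root-find in the single scalar lam. The abstract's claim is that the internal structure of LQN matrices makes this kind of exact solve possible without forming B explicitly.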
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM
Methods: Adam
