First and zeroth-order implementations of the regularized Newton method with lazy approximated Hessians
Nikita Doikov, Geovani Nunes Grapiglia

TL;DR
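This paper introduces first-order (Hessian-free) and zeroth-order (derivative-free) implementations of the cubically regularized Newton method. An adaptive search fits the regularization constant together with the finite-difference parameters, and lazy Hessian updates reuse one Hessian approximation across several iterations, giving improved complexity bounds for non-convex optimization.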
Contribution
The paper presents Hessian-free and derivative-free algorithms with adaptive regularization and lazy Hessian updates, and proves improved global complexity bounds for non-convex problems.
Findings
Global complexity bound of O(n^{1/2} ε^{-3/2}) function and gradient evaluations for the Hessian-free method
Global complexity bound of O(n^{3/2} ε^{-3/2}) function evaluations for the derivative-free method
The algorithms do not require knowledge of the Lipschitz constants
Abstract
In this work, we develop first-order (Hessian-free) and zero-order (derivative-free) implementations of the Cubically regularized Newton method for solving general non-convex optimization problems. For that, we employ finite difference approximations of the derivatives. We use a special adaptive search procedure in our algorithms, which simultaneously fits both the regularization constant and the parameters of the finite difference approximations. It makes our schemes free from the need to know the actual Lipschitz constants. Additionally, we equip our algorithms with the lazy Hessian update that reuses a previously computed Hessian approximation matrix for several iterations. Specifically, we prove the global complexity bound of $\mathcal{O}( n^{1/2} \epsilon^{-3/2})$ function and gradient evaluations for our new Hessian-free method, and a bound of $\mathcal{O}( n^{3/2} \epsilon^{-3/2}…
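To make the ingredients named in the abstract concrete, here is a minimal Python sketch of a Hessian-free cubic Newton loop. It is an illustration under stated assumptions, not the authors' algorithm. The Hessian is approximated by forward differences of the gradient, the approximation is refreshed only every `m` iterations (the lazy update), and the regularization constant `M` is doubled until a step is accepted. The function names, the acceptance test (function value below the cubic model), the choice `m = n`, and the fixed finite-difference step `h` are all assumptions made for this sketch. In particular, the paper fits the finite-difference parameters inside its adaptive search, which this sketch does not attempt.

```python
import numpy as np


def fd_hessian(grad, x, h=1e-5):
    """Approximate the Hessian at x by forward differences of the gradient.

    Costs n + 1 gradient calls and no second derivatives.
    """
    n = x.size
    g0 = grad(x)
    H = np.empty((n, n))
    for i in range(n):
        step = np.zeros(n)
        step[i] = h
        H[:, i] = (grad(x + step) - g0) / h
    return 0.5 * (H + H.T)  # symmetrize the approximation


def cubic_step(g, H, M):
    """Minimize g@s + 0.5*s@H@s + (M/6)*||s||**3 by bisection on r = ||s||.

    The degenerate "hard case" is ignored for brevity (assumption).
    """
    n = g.size
    lam_min = np.linalg.eigvalsh(H)[0]
    # H + (M*r/2) I must be positive definite, which bounds r from below.
    r_lo = max(0.0, -2.0 * lam_min / M) + 1e-12

    def s_of(r):
        return -np.linalg.solve(H + 0.5 * M * r * np.eye(n), g)

    r_hi = r_lo + 1.0
    while np.linalg.norm(s_of(r_hi)) > r_hi:
        r_hi *= 2.0
    for _ in range(60):
        r = 0.5 * (r_lo + r_hi)
        if np.linalg.norm(s_of(r)) > r:
            r_lo = r
        else:
            r_hi = r
    return s_of(r_hi)


def lazy_cubic_newton(f, grad, x0, M0=1.0, m=None, iters=200, tol=1e-6, h=1e-5):
    """Cubic Newton with a finite-difference Hessian reused for m iterations."""
    x = np.asarray(x0, dtype=float)
    m = x.size if m is None else m  # hypothetical default: refresh every n steps
    M = M0
    H = None
    for k in range(iters):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        if k % m == 0:
            H = fd_hessian(grad, x, h)  # lazy update: skipped on other iterations
        fx = f(x)
        for _ in range(60):  # adaptive search over M only (h stays fixed here)
            s = cubic_step(g, H, M)
            model = fx + g @ s + 0.5 * s @ H @ s + (M / 6.0) * np.linalg.norm(s) ** 3
            if f(x + s) <= model:  # assumed acceptance test
                x = x + s
                M = max(M0, M / 2.0)
                break
            M *= 2.0
        else:
            break  # no acceptable step found
    return x


# Hypothetical non-convex test problem: separable double-well potential.
def f_demo(x):
    return float(np.sum(0.25 * x**4 - 0.5 * x**2))


def grad_demo(x):
    return x**3 - x
```

Calling `lazy_cubic_newton(f_demo, grad_demo, np.array([0.5, -0.3]))` runs the loop from a point where the Hessian is negative definite. Because no Lipschitz constant is supplied, the doubling of `M` does the work that a known constant would otherwise do, which is the role the abstract assigns to the adaptive search.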
Taxonomy
Topics: Advanced Optimization Algorithms Research · Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques
