Learning to Optimize Quasi-Newton Methods
Isaac Liao, Rumen R. Dangovski, Jakob N. Foerster, Marin Soljačić

TL;DR
LODO is a novel meta-learning optimizer that dynamically learns preconditioners during training, combining Learning to Optimize (L2O) and quasi-Newton methods to adapt to the loss landscape and improve optimization efficiency.
Contribution
The paper introduces LODO, a meta-learning optimizer that learns preconditioners on the fly without prior meta-training, merging L2O with quasi-Newton techniques for flexible inverse Hessian approximation.
Findings
LODO effectively optimizes in noisy loss landscapes.
Simpler inverse Hessian representations reduce performance.
LODO trains a neural network with 95k parameters efficiently.
Abstract
Fast gradient-based optimization algorithms have become increasingly essential for the computationally efficient training of machine learning models. One technique is to multiply the gradient by a preconditioner matrix to produce a step, but it is unclear what the best preconditioner matrix is. This paper introduces a novel machine learning optimizer called LODO, which tries to online meta-learn the best preconditioner during optimization. Specifically, our optimizer merges Learning to Optimize (L2O) techniques with quasi-Newton methods to learn preconditioners parameterized as neural networks; they are more flexible than preconditioners in other quasi-Newton methods. Unlike other L2O methods, LODO does not require any meta-training on a training task distribution, and instead learns to optimize on the fly while optimizing on the test task, adapting to the local characteristics of the…
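To make the idea of a preconditioned step concrete, here is a minimal sketch in Python. It is not the authors' LODO implementation: LODO parameterizes the preconditioner as a neural network, whereas this sketch assumes a plain dense matrix `P` and updates it with a simple hypergradient rule. The function names, the quadratic test problem, and the values of `meta_lr` and `init_scale` are all illustrative assumptions. The step is x_{t+1} = x_t - P g_t, and the gradient of the next loss f(x_{t+1}) with respect to P is -g_{t+1} g_t^T, so P is nudged along g_{t+1} g_t^T after each step.

```python
import numpy as np


def quadratic_grad(A, x):
    """Gradient of the toy objective f(x) = 0.5 * x^T A x."""
    return A @ x


def online_preconditioned_descent(A, x0, steps=200, meta_lr=0.01, init_scale=0.1):
    """Preconditioned gradient descent with a preconditioner learned on the fly.

    Assumption: P is a dense matrix (not a neural network as in LODO) and is
    updated by one hypergradient step per iteration.
    """
    x = np.asarray(x0, dtype=float).copy()
    P = init_scale * np.eye(x.size)  # start from a scaled identity
    g = quadratic_grad(A, x)
    losses = []
    for _ in range(steps):
        x_new = x - P @ g                      # preconditioned step
        g_new = quadratic_grad(A, x_new)
        # d f(x_new) / dP = -outer(g_new, g), so descend by adding the outer product.
        P = P + meta_lr * np.outer(g_new, g)
        x, g = x_new, g_new
        losses.append(0.5 * x @ A @ x)
    return x, P, losses


# Hypothetical ill-conditioned toy problem: curvature 1 and 4 along the two axes.
A = np.diag([1.0, 4.0])
x_final, P_final, losses = online_preconditioned_descent(A, x0=[1.0, 1.0])
print("final loss:", losses[-1])
print("learned preconditioner:\n", P_final)
```

In this toy setting the learned `P` grows along the low-curvature direction, which is the qualitative behavior a preconditioner is meant to provide. A dense matrix costs memory quadratic in the number of parameters, so this form is only a teaching device for very small problems.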
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
Methods: Test
