Learning to Optimize Neural Nets
Ke Li, Jitendra Malik

TL;DR
This paper extends the reinforcement learning-based Learning to Optimize framework to learn an optimization algorithm for training shallow neural networks. The learned optimizer generalizes to unseen datasets and is robust to changes in gradient stochasticity and network architecture.
Contribution
It develops an extension of existing reinforcement learning methods suited to high-dimensional stochastic optimization. The optimization algorithm learned with this extension outperforms other known optimization algorithms on unseen tasks.
Findings
The learned optimizer consistently outperforms other known optimization algorithms on unseen datasets.
The optimizer generalizes across different neural network architectures.
It remains robust to variations in gradient stochasticity.
Abstract
Learning to Optimize is a recently proposed framework for learning optimization algorithms using reinforcement learning. In this paper, we explore learning an optimization algorithm for training shallow neural nets. Such high-dimensional stochastic optimization problems present interesting challenges for existing reinforcement learning algorithms. We develop an extension that is suited to learning optimization algorithms in this setting and demonstrate that the learned optimization algorithm consistently outperforms other known optimization algorithms even on unseen tasks and is robust to changes in stochasticity of gradients and the neural net architecture. More specifically, we show that an optimization algorithm trained with the proposed method on the problem of training a neural net on MNIST generalizes to the problems of training neural nets on the Toronto Faces Dataset, CIFAR-10…
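To make the framing in the abstract concrete, the following is a minimal sketch, under stated assumptions, of what it means to treat an optimizer as a learnable policy. It is not the authors' method: the toy quadratic objective, the two-parameter linear update rule, the trajectory cost, and all function names (`rollout`, `policy_step`, `random_search`) are hypothetical, and plain random search stands in for the reinforcement learning extension the paper develops.

```python
import random


def objective(x):
    # Toy objective (assumption): f(x) = 0.5 * ||x||^2, minimized at the origin.
    return 0.5 * sum(xi * xi for xi in x)


def noisy_grad(x, noise, rng):
    # Stochastic gradient: the true gradient of f is x; add Gaussian noise.
    return [xi + rng.gauss(0.0, noise) for xi in x]


def policy_step(grad, avg_grad, theta):
    # Hypothetical policy: the update is a linear function of the current
    # gradient and a running average of past gradients.
    return [-(theta[0] * g + theta[1] * m) for g, m in zip(grad, avg_grad)]


def rollout(theta, dim=5, steps=20, noise=0.1, seed=0):
    # Run the policy as an optimizer and return the summed objective value
    # along the trajectory (an assumed cost; lower means faster progress).
    rng = random.Random(seed)
    x = [1.0] * dim
    avg = [0.0] * dim
    total = 0.0
    for _ in range(steps):
        g = noisy_grad(x, noise, rng)
        avg = [0.9 * m + 0.1 * gi for m, gi in zip(avg, g)]
        x = [xi + si for xi, si in zip(x, policy_step(g, avg, theta))]
        total += objective(x)
    return total


def random_search(iters=200, seed=1):
    # Stand-in for policy learning: keep a perturbation of theta only if it
    # lowers the rollout cost. This is NOT the paper's training procedure.
    rng = random.Random(seed)
    best_theta = [0.01, 0.0]
    best_cost = rollout(best_theta)
    for _ in range(iters):
        cand = [t + rng.gauss(0.0, 0.05) for t in best_theta]
        cost = rollout(cand)
        if cost < best_cost:
            best_theta, best_cost = cand, cost
    return best_theta, best_cost


if __name__ == "__main__":
    theta, cost = random_search()
    print("learned theta:", theta)
    print("trajectory cost:", cost, "vs. initial:", rollout([0.01, 0.0]))
```

In this sketch the "optimizer" is the pair of numbers in `theta`, and learning means searching for the values that drive the objective down fastest over a rollout. The setting described in the abstract differs in scale and method: the optimizee is a neural net trained on data such as MNIST, and the policy is learned with reinforcement learning rather than random search.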
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Advanced Neural Network Applications
