Why Line Search when you can Plane Search? SO-Friendly Neural Networks allow Per-Iteration Optimization of Learning and Momentum Rates for Every Layer
Betty Shea, Mark Schmidt

TL;DR
This paper introduces SO-friendly neural networks that enable efficient per-iteration optimization of learning and momentum rates using plane search, leading to faster, hyper-parameter insensitive training.
Contribution
The paper presents a new class of SO-friendly networks that allow for cost-effective, precise per-iteration optimization of training hyper-parameters using line and plane search methods.
Findings
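For SO-friendly networks, a plane search over the learning and momentum rates has the same asymptotic cost as a line search during full-batch training.
Subspace optimization can set a separate learning rate and momentum rate for each layer on each iteration.
Experiments show faster, more reliable training that is less sensitive to hyper-parameter choices.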
Abstract
We introduce the class of SO-friendly neural networks, which includes several models used in practice, including networks with 2 layers of hidden weights where the number of inputs is larger than the number of outputs. SO-friendly networks have the property that performing a precise line search to set the step size on each iteration has the same asymptotic cost during full-batch training as using a fixed learning rate. Further, for the same cost a plane search can be used to set both the learning and momentum rate on each step. Even further, SO-friendly networks also allow us to use subspace optimization to set a learning rate and momentum rate for each layer on each iteration. We explore augmenting gradient descent as well as quasi-Newton methods and Adam with line optimization and subspace optimization, and our experiments indicate that this gives fast and reliable ways to train these…
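To make the plane-search idea concrete, here is a minimal sketch on linear least squares, used as a simple stand-in rather than the two-layer networks the paper targets. The function name `plane_search_step` and the test problem are assumptions made for illustration and are not taken from the paper. The point it illustrates: once the two search directions (negative gradient and momentum) have been multiplied by the data matrix, choosing the learning rate and momentum rate reduces to a 2-variable problem whose cost does not depend on the number of parameters.

```python
import numpy as np

def plane_search_step(X, y, w, w_prev):
    """One gradient + momentum step on f(w) = 0.5 * ||X w - y||^2, with the
    learning rate (alpha) and momentum rate (beta) set by exact plane search.
    Illustrative sketch only; not the paper's implementation."""
    r = X @ w - y                      # residual at the current iterate
    g = X.T @ r                        # gradient of f at w
    m = w - w_prev                     # momentum direction
    # Push both search directions through X once (n x 2 matrix).
    D = np.column_stack([X @ (-g), X @ m])
    # f(w - alpha*g + beta*m) = 0.5 * ||r + D t||^2 with t = (alpha, beta).
    # lstsq returns the minimum-norm solution, so beta = 0 when m = 0.
    t, *_ = np.linalg.lstsq(D, -r, rcond=None)
    alpha, beta = t
    return w - alpha * g + beta * m, alpha, beta

# Hypothetical usage on random data.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = rng.standard_normal(50)
w_prev = w = np.zeros(5)
for _ in range(15):
    w_next, alpha, beta = plane_search_step(X, y, w, w_prev)
    w_prev, w = w, w_next
```

In this sketch the products `X @ w` and `X @ m` are recomputed on every call for readability. They could instead be updated from the previous iteration's `D @ t`, which would leave one product with `X` and one with `X.T` per step, the same count a fixed-step gradient method needs.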
Taxonomy
Topics: Metaheuristic Optimization Algorithms Research · Neural Networks and Applications · Advanced Measurement and Metrology Techniques
Methods: Sparse Evolutionary Training · Adam
