Guarantees for Tuning the Step Size using a Learning-to-Learn Approach
Xiang Wang, Shuai Yuan, Chenwei Wu, Rong Ge

TL;DR
This paper provides theoretical guarantees for tuning the step size on quadratic loss using a learning-to-learn approach. It addresses meta-gradient explosion/vanishing and the generalization of the learned optimizer, with empirical validation.
Contribution
It offers the first meta-optimization guarantees for learning-to-learn methods on quadratic problems, analyzing meta-gradient issues and the importance of validation sets.
Findings
Naive meta-objectives cause gradient explosion/vanishing.
Designing bounded meta-objectives mitigates numerical issues.
Meta-gradient computation on validation sets improves generalization.
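The first two findings can be illustrated with a small numerical sketch. It is not the authors' code, and it rests on several assumptions: the inner optimizer is plain gradient descent on a diagonal quadratic f(w) = 0.5 * sum(eigs * w**2), the naive meta-objective is the final loss f(w_T), and the bounded alternative is (1/T) * log f(w_T). The names `final_loss_and_grad`, `log_objective_grad`, `eigs`, `w0`, and `T` are hypothetical and chosen only for this example.

```python
import numpy as np

def final_loss_and_grad(eta, eigs, w0, T):
    """Final loss f(w_T) and d f(w_T) / d eta for T steps of gradient descent
    on the (assumed) quadratic f(w) = 0.5 * sum(eigs * w**2)."""
    r = 1.0 - eta * eigs  # per-coordinate factor: w_T = r**T * w0
    loss = 0.5 * np.sum(eigs * r ** (2 * T) * w0 ** 2)
    grad = -T * np.sum(eigs ** 2 * r ** (2 * T - 1) * w0 ** 2)
    return loss, grad

def log_objective_grad(eta, eigs, w0, T):
    """Meta-gradient of the (assumed) bounded objective (1/T) * log f(w_T)."""
    loss, grad = final_loss_and_grad(eta, eigs, w0, T)
    return grad / (T * loss)

eigs = np.array([0.5, 1.0])  # curvature along each coordinate (illustrative)
w0 = np.ones(2)              # initial point (illustrative)
T = 100                      # number of inner gradient-descent steps

for eta in (1.0, 2.5):
    _, g_naive = final_loss_and_grad(eta, eigs, w0, T)
    g_log = log_objective_grad(eta, eigs, w0, T)
    print(f"eta={eta}: naive meta-gradient {g_naive:.3e}, "
          f"log meta-gradient {g_log:.3f}")
```

Under these assumptions, the naive meta-gradient is vanishingly small at the convergent step size and astronomically large at the divergent one, while the log-based meta-gradient stays of order one in both cases.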
Abstract
Choosing the right parameters for optimization algorithms is often the key to their success in practice. Solving this problem using a learning-to-learn approach -- using meta-gradient descent on a meta-objective based on the trajectory that the optimizer generates -- was recently shown to be effective. However, the meta-optimization problem is difficult. In particular, the meta-gradient can often explode/vanish, and the learned optimizer may not have good generalization performance if the meta-objective is not chosen carefully. In this paper we give meta-optimization guarantees for the learning-to-learn approach on a simple problem of tuning the step size for quadratic loss. Our results show that the naïve objective suffers from the meta-gradient explosion/vanishing problem. Although there is a way to design the meta-objective so that the meta-gradient remains polynomially bounded,…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
