Guarantees for Tuning the Step Size using a Learning-to-Learn Approach

Xiang Wang; Shuai Yuan; Chenwei Wu; Rong Ge

arXiv:2006.16495 · stat.ML · June 14, 2021 · 6 cites

TL;DR

This paper provides theoretical guarantees for tuning the step size of an optimizer with a learning-to-learn approach. It addresses meta-gradient explosion/vanishing and generalization, and backs the analysis with empirical validation.

Contribution

The paper offers the first meta-optimization guarantees for learning-to-learn methods on quadratic problems. It analyzes when the meta-gradient explodes or vanishes and why a validation set matters for generalization.

Findings

01 Naive meta-objectives cause the meta-gradient to explode or vanish.

02 Designing bounded meta-objectives mitigates these numerical issues.

03 Computing the meta-gradient on a validation set improves generalization.

Abstract

Choosing the right parameters for optimization algorithms is often the key to their success in practice. Solving this problem using a learning-to-learn approach -- using meta-gradient descent on a meta-objective based on the trajectory that the optimizer generates -- was recently shown to be effective. However, the meta-optimization problem is difficult. In particular, the meta-gradient can often explode/vanish, and the learned optimizer may not have good generalization performance if the meta-objective is not chosen carefully. In this paper we give meta-optimization guarantees for the learning-to-learn approach on a simple problem of tuning the step size for quadratic loss. Our results show that the naïve objective suffers from meta-gradient explosion/vanishing problem. Although there is a way to design the meta-objective so that the meta-gradient remains polynomially bounded,…
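
The explosion/vanishing behavior described in the abstract can be illustrated with a minimal sketch. It assumes a diagonal quadratic loss f(w) = 0.5 * sum(eigvals * w**2) and plain gradient descent with a fixed step size eta, so the iterate after T steps has a closed form. The function names, the eigenvalues, and the choice of (1/T) * log(loss) as a bounded meta-objective are assumptions made for this illustration and are not taken from the paper's code.

```python
import numpy as np

def naive_meta_objective_and_grad(eta, eigvals, w0, T):
    """Loss after T gradient-descent steps and its derivative w.r.t. eta.

    For f(w) = 0.5 * sum(eigvals * w**2), each coordinate evolves as
    w_T[i] = (1 - eta * eigvals[i])**T * w0[i], so both quantities
    are available in closed form.
    """
    shrink = 1.0 - eta * eigvals
    loss = 0.5 * np.sum(eigvals * shrink ** (2 * T) * w0 ** 2)
    grad = -T * np.sum(eigvals ** 2 * shrink ** (2 * T - 1) * w0 ** 2)
    return loss, grad

def log_meta_grad(eta, eigvals, w0, T):
    """Derivative of the (assumed) objective (1/T) * log(loss) w.r.t. eta."""
    loss, grad = naive_meta_objective_and_grad(eta, eigvals, w0, T)
    return grad / (T * loss)

if __name__ == "__main__":
    eigvals = np.array([1.0, 2.0])   # hypothetical curvature values
    w0 = np.array([1.0, 1.0])        # hypothetical starting point
    T = 100                          # length of the inner trajectory
    for eta in (0.1, 1.5):           # a small step and a divergent step
        _, g = naive_meta_objective_and_grad(eta, eigvals, w0, T)
        print(f"eta={eta}: naive meta-gradient {g:.3e}, "
              f"log meta-gradient {log_meta_grad(eta, eigvals, w0, T):.3f}")
```

In this toy setting the naive meta-gradient scales like (1 - eta * lambda)**(2T - 1): it shrinks exponentially in T when the step size is small and grows exponentially when the step size makes the iterates diverge. The derivative of the log-transformed objective stays of moderate size in both regimes.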

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

Kolin96/learning-to-learn (tf, Official)

Videos

Guarantees for Tuning the Step Size using a Learning-to-Learn Approach · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques · Machine Learning and ELM