Learning to Learn without Gradient Descent by Gradient Descent

Yutian Chen; Matthew W. Hoffman; Sergio Gomez Colmenarejo; Misha Denil; Timothy P. Lillicrap; Matt Botvinick; Nando de Freitas

arXiv:1611.03824·stat.ML·June 13, 2017·162 cites

Open Access

TL;DR

This paper introduces learned recurrent neural network optimizers trained via gradient descent that can efficiently optimize a wide range of black-box functions, demonstrating strong transferability and competitive performance against traditional Bayesian methods.

Contribution

The paper trains recurrent neural network optimizers on simple synthetic functions and shows that the trained optimizers generalize to diverse black-box optimization tasks.

Findings

01

Learned optimizers transfer effectively across tasks.

02

Learned optimizers compare favourably with heavily engineered Bayesian optimization packages on hyper-parameter tuning.

03

Up to the training horizon, they learn to trade off exploration and exploitation.

Abstract

We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to trade off exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning.
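
To make the idea of a recurrent optimizer for derivative-free functions concrete, here is a minimal sketch of the query loop only. It assumes a recurrence in which the network receives its previous hidden state, previous query point and previous function value, and emits the next query point; the function never exposes gradients. All names (`make_rnn`, `rnn_step`, `run_optimizer`, `quadratic`) are hypothetical, the tiny tanh RNN stands in for whatever architecture the paper uses, and the weights are random rather than trained, so the output illustrates the interface and not the reported performance.

```python
import math
import random


def make_rnn(dim, hidden, seed=0):
    """Random (untrained) weights for a tiny tanh RNN; illustrative only."""
    rng = random.Random(seed)
    n_in = hidden + dim + 1  # previous hidden state, previous query, previous value

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {"W_h": mat(hidden, n_in), "W_x": mat(dim, hidden)}


def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def rnn_step(params, h, x_prev, y_prev):
    """One step: (h_prev, x_prev, y_prev) -> (h_new, x_new)."""
    inp = h + x_prev + [y_prev]
    h_new = [math.tanh(z) for z in matvec(params["W_h"], inp)]
    # Squash the proposed query into the box [-1, 1]^dim (an assumption).
    x_new = [math.tanh(z) for z in matvec(params["W_x"], h_new)]
    return h_new, x_new


def run_optimizer(params, f, dim, hidden, horizon):
    """Query the black-box f for `horizon` steps using only function values."""
    h = [0.0] * hidden
    x, y = [0.0] * dim, 0.0
    history = []
    for _ in range(horizon):
        h, x = rnn_step(params, h, x, y)
        y = f(x)  # only the value is observed, never a gradient
        history.append((x, y))
    return history


def quadratic(x):
    """Hypothetical stand-in for a synthetic training function."""
    return sum((xi - 0.3) ** 2 for xi in x)


params = make_rnn(dim=2, hidden=8)
history = run_optimizer(params, quadratic, dim=2, hidden=8, horizon=10)
best_x, best_y = min(history, key=lambda pair: pair[1])
# One plausible training signal: the sum of observed values over the horizon.
summed_loss = sum(y for _, y in history)
```

In a trained version, the RNN weights would be adjusted by gradient descent on a loss such as `summed_loss`, differentiated through the unrolled loop on differentiable synthetic functions. The resulting optimizer could then be run on functions for which no derivatives are available, for at most the number of steps it was trained on.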

Taxonomy

Topics: Higher Education Learning Practices · Intelligent Tutoring Systems and Adaptive Learning