Stateless Neural Meta-Learning using Second-Order Gradients

Mike Huisman; Aske Plaat; Jan N. van Rijn

arXiv:2104.10527 · cs.LG · November 8, 2022

TL;DR

This paper introduces TURTLE, a meta-learning algorithm that uses second-order gradients to outperform MAML and the meta-learner LSTM on few-shot sine wave regression and image classification, at a computational cost comparable to that of second-order MAML.

Contribution

The paper formally shows that the meta-learner LSTM subsumes MAML and proposes TURTLE, an algorithm that is simpler than the meta-learner LSTM yet more expressive than MAML and that relies on second-order gradients for its performance.

Findings

01 TURTLE outperforms MAML and the meta-learner LSTM on few-shot sine wave regression and on image classification (miniImageNet and CUB).

02 Second-order gradients are the key to TURTLE's performance.

03 TURTLE achieves these results without any additional hyperparameter tuning.

Abstract

Deep learning typically requires large data sets and much compute power for each new problem that is learned. Meta-learning can be used to learn a good prior that facilitates quick learning, thereby relaxing these requirements so that new tasks can be learned quicker; two popular approaches are MAML and the meta-learner LSTM. In this work, we compare the two and formally show that the meta-learner LSTM subsumes MAML. Combining this insight with recent empirical findings, we construct a new algorithm (dubbed TURTLE) which is simpler than the meta-learner LSTM yet more expressive than MAML. TURTLE outperforms both techniques at few-shot sine wave regression and image classification on miniImageNet and CUB without any additional hyperparameter tuning, at a computational cost that is comparable with second-order MAML. The key to TURTLE's success lies in the use of second-order gradients,…
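To make the role of second-order gradients concrete, here is a minimal sketch in plain Python. It is not TURTLE and not the authors' code: it assumes a toy scalar model `y_hat = w * x` with mean squared error and a single MAML-style inner gradient step, and every name in it (`meta_gradient`, `inner_step`, `alpha`, the data) is hypothetical. It shows only how the meta-gradient with respect to the initial weight differs when the inner update is differentiated (second-order) versus treated as a constant shift (first-order approximation).

```python
# Toy illustration only: scalar model y_hat = w * x, mean squared error loss.
# All function names, data, and the step size alpha are assumptions.

def mse(w, xs, ys):
    """Mean squared error of the scalar linear model."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_grad(w, xs, ys):
    """dL/dw for L = mean((w*x - y)^2)."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

def mse_hessian(xs):
    """d^2L/dw^2 = mean(2*x^2); does not depend on w for this model."""
    return sum(2 * x * x for x in xs) / len(xs)

def inner_step(w, support, alpha):
    """One gradient step on the support set (task adaptation)."""
    xs, ys = support
    return w - alpha * mse_grad(w, xs, ys)

def meta_gradient(w, support, query, alpha, second_order=True):
    """Gradient of the post-adaptation query loss w.r.t. the initial w."""
    w_fast = inner_step(w, support, alpha)
    outer = mse_grad(w_fast, *query)  # dL_query/dw_fast
    if not second_order:
        # First-order approximation: pretend dw_fast/dw == 1.
        return outer
    # Chain rule: dw_fast/dw = 1 - alpha * (Hessian of the support loss).
    return outer * (1.0 - alpha * mse_hessian(support[0]))

def adapted_query_loss(w, support, query, alpha):
    """Query loss after one inner step; the quantity being meta-optimized."""
    return mse(inner_step(w, support, alpha), *query)

support = ([1.0, 2.0], [2.0, 4.0])  # hypothetical task with target slope 2
query = ([3.0], [6.0])
w0, alpha = 0.5, 0.05

g_second = meta_gradient(w0, support, query, alpha, second_order=True)
g_first = meta_gradient(w0, support, query, alpha, second_order=False)
print("second-order meta-gradient:", g_second)
print("first-order approximation: ", g_first)
```

In this toy setting the second-order term rescales the outer gradient by the curvature of the support loss. The first-order variant drops that factor, so the two gradients differ whenever the support loss has nonzero curvature.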

Code & Models

Repositories

mikehuisman/revisiting-learned-optimizers (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Neural Networks and Applications

Methods: Tanh Activation · Sigmoid Activation · Model-Agnostic Meta-Learning · Long Short-Term Memory