How to train your MAML

Antreas Antoniou; Harrison Edwards; Amos Storkey

arXiv:1810.09502·cs.LG·March 7, 2019·83 cites

TL;DR

This paper introduces MAML++, a set of modifications to the original MAML algorithm that enhance stability, generalization, and efficiency in few-shot learning tasks.

Contribution

The paper proposes MAML++, a series of improvements to MAML that address its instability, hyperparameter sensitivity, and computational costs, leading to better performance.

Findings

01. MAML++ improves training stability and convergence speed.

02. MAML++ enhances generalization performance on few-shot tasks.

03. MAML++ reduces computational overhead during training and inference.

Abstract

The field of few-shot learning has recently seen substantial advancements, most of which came from casting few-shot learning as a meta-learning problem. Model-Agnostic Meta-Learning (MAML) is currently one of the best approaches for few-shot learning via meta-learning. MAML is simple, elegant and very powerful; however, it has a variety of issues: it is very sensitive to neural network architectures, which often leads to instability during training; it requires arduous hyperparameter searches to stabilize training and achieve high generalization; and it is very computationally expensive at both training and inference time. In this paper, we propose various modifications to MAML, which we call MAML++, that not only stabilize the system but also substantially improve its generalization performance and convergence speed and reduce its computational overhead.
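For readers unfamiliar with the baseline, the following is a minimal sketch of the original MAML update that the abstract refers to. It does not cover the MAML++ modifications, which the abstract does not detail. It assumes a toy scalar model f(x) = w * x with mean squared error and a single inner gradient step, so that the second-order meta-gradient can be written by hand. The function names, the task distribution, and the step sizes alpha and beta are illustrative choices, not taken from the paper.

```python
import random


def grad(w, xs, ys):
    """Gradient of the mean squared error of f(x) = w * x with respect to w."""
    return 2.0 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


def hessian(xs):
    """Second derivative of the same loss; constant in w for this toy model."""
    return 2.0 * sum(x * x for x in xs) / len(xs)


def maml_meta_gradient(w, task, alpha):
    """Exact meta-gradient for one inner gradient step on a single task."""
    (xs_s, ys_s), (xs_q, ys_q) = task
    # Inner loop: adapt the shared initialization on the support set.
    w_adapted = w - alpha * grad(w, xs_s, ys_s)
    # Derivative of the adapted weight with respect to the initialization
    # (this is where the second-order term enters).
    d_adapted_dw = 1.0 - alpha * hessian(xs_s)
    # Outer loop: query-set gradient, chained back through the inner step.
    return grad(w_adapted, xs_q, ys_q) * d_adapted_dw


def sample_task(rng, n=5):
    """Hypothetical task distribution: lines y = slope * x, slope in [1, 3]."""
    slope = rng.uniform(1.0, 3.0)

    def draw():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        return xs, [slope * x for x in xs]

    return draw(), draw()  # (support set, query set)


def train(steps=500, alpha=0.1, beta=0.05, batch=4, seed=0):
    """Meta-train the initialization w with plain gradient descent."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        tasks = [sample_task(rng) for _ in range(batch)]
        meta_grad = sum(maml_meta_gradient(w, t, alpha) for t in tasks) / batch
        w -= beta * meta_grad
    return w
```

In this sketch, `train()` returns a single initialization from which one inner step on a new task's support set moves the weight toward that task's slope. In a real network the hand-written `hessian` term would be replaced by differentiating through the inner step with an automatic differentiation library, which is the source of the computational cost the abstract mentions.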

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Advanced Neural Network Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Model-Agnostic Meta-Learning