A Simple Neural Attentive Meta-Learner

Nikhil Mishra; Mostafa Rohaninejad; Xi Chen; Pieter Abbeel

arXiv:1707.03141 · cs.AI · February 27, 2018 · 759 cites

TL;DR

This paper introduces SNAIL, a simple, generic neural meta-learner architecture combining temporal convolutions and soft attention, achieving state-of-the-art results across various tasks in supervised and reinforcement learning.

Contribution

The paper presents a novel meta-learner architecture that is simple, generic, and effective, avoiding extensive hand-design and outperforming previous methods.

Findings

01. SNAIL achieves state-of-the-art performance on multiple benchmarks.

02. SNAIL significantly outperforms existing meta-learning approaches.

03. The architecture is effective in both supervised and reinforcement learning tasks.

Abstract

Deep neural networks excel in regimes with large amounts of data, but tend to struggle when data is scarce or when they need to adapt quickly to changes in the task. In response, recent work in meta-learning proposes training a meta-learner on a distribution of similar tasks, in the hopes of generalization to novel but related tasks by learning a high-level strategy that captures the essence of the problem it is asked to solve. However, many recent meta-learning approaches are extensively hand-designed, either using architectures specialized to a particular application, or hard-coding algorithmic components that constrain how the meta-learner solves the task. We propose a class of simple and generic meta-learner architectures that use a novel combination of temporal convolutions and soft attention; the former to aggregate information from past experience and the latter to pinpoint…
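
To make the two ingredients named in the abstract concrete, here is a minimal pure-Python sketch of a causal temporal convolution stack followed by a causal soft-attention read. It is an illustration under stated assumptions, not the authors' implementation: the function names (`causal_dilated_conv`, `causal_soft_attention`, `forward`), the kernel size of 2, the tanh nonlinearity, the dilations 1, 2, 4, the single attention head, and the random weights are all hypothetical choices made for the example, and the block design in the paper may differ.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def causal_dilated_conv(x, w_prev, w_curr, dilation):
    """Kernel-size-2 causal convolution over time (assumed form).

    x is a list of T feature vectors. The output at step t depends only on
    x[t] and x[t - dilation]; steps before the sequence start count as zeros.
    """
    zeros = [0.0] * len(x[0])
    shifted = [x[t - dilation] if t >= dilation else zeros for t in range(len(x))]
    a, b = matmul(shifted, w_prev), matmul(x, w_curr)
    return [[math.tanh(p + q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def causal_soft_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention; step t attends to steps <= t."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    reads, weights = [], []
    for t in range(len(x)):
        # Scores over the past and present only, so no future step leaks in.
        scores = [sum(a * b for a, b in zip(q[t], k[s])) / scale for s in range(t + 1)]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        w = [e / total for e in exps] + [0.0] * (len(x) - t - 1)
        weights.append(w)
        reads.append([sum(w[s] * v[s][j] for s in range(t + 1)) for j in range(len(v[0]))])
    return reads, weights

def random_matrix(rng, rows, cols):
    """Random weights stand in for learned parameters in this sketch."""
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

def forward(x, seed=0, hidden=3):
    """Aggregate with temporal convolutions, then pinpoint with attention."""
    rng = random.Random(seed)
    h = x
    for dilation in (1, 2, 4):  # doubling dilation widens the receptive field
        w_prev = random_matrix(rng, len(h[0]), hidden)
        w_curr = random_matrix(rng, len(h[0]), hidden)
        conv = causal_dilated_conv(h, w_prev, w_curr, dilation)
        h = [row + extra for row, extra in zip(h, conv)]  # concatenate features
    w_q, w_k, w_v = (random_matrix(rng, len(h[0]), hidden) for _ in range(3))
    reads, weights = causal_soft_attention(h, w_q, w_k, w_v)
    return [row + r for row, r in zip(h, reads)], weights
```

Calling `forward` on a list of per-timestep feature vectors returns one output vector per timestep plus the attention weights. Both layers are causal, so changing the last input leaves every earlier output unchanged. A sequence model needs this property when it consumes past experience and must act on the current step.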

Taxonomy

Topics: Neural Networks and Applications

Methods: Softmax · Dilated Causal Convolution · Attention Is All You Need · Simple Neural Attention Meta-Learner