Few-shot Sequence Learning with Transformers

Lajanugen Logeswaran; Ann Lee; Myle Ott; Honglak Lee; Marc'Aurelio Ranzato; Arthur Szlam

arXiv:2012.09543·cs.LG·December 18, 2020·5 cites

Open Access

TL;DR

This paper introduces a simple, efficient Transformer-based method for few-shot sequence learning that optimizes task-specific tokens on the fly without complex architecture modifications.

Contribution

Proposes a novel few-shot learning approach using task tokens in Transformers, avoiding complex modifications and second-order derivatives.

Findings

01. Performs comparably to existing methods in various tasks

02. More computationally efficient than current approaches

03. Utilizes compositional task descriptors to enhance performance

Abstract

Few-shot algorithms aim to learn new tasks given only a handful of training examples. In this work we investigate few-shot learning in the setting where the data points are sequences of tokens, and we propose an efficient learning algorithm based on Transformers. In the simplest setting, we append to the input sequence a token that represents the particular task to be undertaken, and show that the embedding of this token can be optimized on the fly given a few labeled examples. Our approach requires neither complicated changes to the model architecture, such as adapter layers, nor the computation of second-order derivatives, as is currently popular in the meta-learning and few-shot learning literature. We demonstrate our approach on a variety of tasks, and analyze the generalization properties of several model variants and baseline approaches. In particular, we show that compositional task…
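
The core idea in the abstract, adapting to a new task by optimizing only a task embedding while the model itself stays fixed, can be illustrated with a toy sketch. This is not the paper's Transformer: a frozen bilinear scorer stands in for the pretrained network, and every name here (`features`, `adapt_task_embedding`, `M`, `z`), the synthetic data, and the hyperparameters are assumptions made for illustration. The sketch only shows that adaptation touches the task vector alone and needs nothing beyond first-order gradients.

```python
import math
import random

def features(M, x):
    # Frozen "model body" (hypothetical stand-in for a pretrained network):
    # project input x through the fixed matrix M (d_in x d_z).
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def predict(M, z, x):
    # Probability of the positive class given input x and task embedding z.
    h = features(M, x)
    score = sum(hj * zj for hj, zj in zip(h, z))
    return 1.0 / (1.0 + math.exp(-score))

def loss(M, z, support):
    # Mean binary cross-entropy over the few labeled examples.
    eps = 1e-12
    total = 0.0
    for x, y in support:
        p = predict(M, z, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(support)

def adapt_task_embedding(M, support, steps=100, lr=0.05):
    # Plain first-order gradient descent on z only; M is never updated.
    z = [0.0] * len(M[0])
    for _ in range(steps):
        grad = [0.0] * len(z)
        for x, y in support:
            h = features(M, x)
            err = predict(M, z, x) - y  # derivative of the loss w.r.t. the score
            for j in range(len(z)):
                grad[j] += err * h[j] / len(support)
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z

# Toy setup: a fixed "pretrained" matrix and a hidden task vector that labels the data.
rng = random.Random(0)
d_in, d_z = 4, 3
M = [[rng.gauss(0, 1) for _ in range(d_z)] for _ in range(d_in)]
M_before = [row[:] for row in M]
z_true = [rng.gauss(0, 1) for _ in range(d_z)]

support = []
for _ in range(8):  # a handful of labeled examples
    x = [rng.gauss(0, 1) for _ in range(d_in)]
    y = 1.0 if predict(M, z_true, x) > 0.5 else 0.0
    support.append((x, y))

loss_before = loss(M, [0.0] * d_z, support)
z = adapt_task_embedding(M, support)
loss_after = loss(M, z, support)
```

In this toy setting the only quantity that changes during adaptation is `z`, so per-task storage is a single small vector and the update needs no second-order derivatives. How closely this mirrors the paper's actual training procedure is not something the abstract specifies.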

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research