One-Shot Learning for Language Modelling

Talip Ucar; Adrian Gonzalez-Martin; Matthew Lee; Adrian Daniel Szwarc

arXiv:2007.09679 · cs.CL · July 21, 2020

TL;DR

This paper investigates one-shot learning in NLP by adapting matching networks, comparing various similarity measures, and establishing a benchmark for few-shot word prediction tasks using the WikiText-2 dataset.

Contribution

It explores the effectiveness of different distance metrics in few-shot learning and establishes a publicly available benchmark for future research in NLP.

Findings

01 No single distance metric is best for k-shot learning.

02 Performance depends on the number of shots used during training.

03 A benchmark dataset for one- to three-shot learning in NLP is provided.

Abstract

Humans can infer a great deal about the meaning of a word using the syntax and semantics of surrounding words, even if it is their first time reading or hearing it. We can also generalise the learned concept of the word to new tasks. Despite great progress in achieving human-level performance in certain tasks (Silver et al., 2016), learning from one or few examples remains a key challenge in machine learning, and has not been thoroughly explored in Natural Language Processing (NLP). In this work we tackle the problem of one-shot learning for an NLP task by employing ideas from recent developments in machine learning: embeddings, attention mechanisms (softmax) and similarity measures (cosine, Euclidean, Poincaré, and Minkowski). We adapt the framework suggested in matching networks (Vinyals et al., 2016), and explore the effectiveness of the aforementioned methods in one, two and…
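
As a rough illustration of the idea in the abstract, the sketch below shows how a matching-network-style classifier could turn similarity scores into a softmax attention over a small support set and then sum the attention weights per label. It is not the authors' implementation: the function names (`cosine`, `neg_minkowski`, `predict`), the toy two-dimensional embeddings, and the labels are all assumptions made for illustration. The paper's embeddings are learned, and the Poincaré distance is omitted here for brevity.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def neg_minkowski(a, b, p=2.0):
    """Negated Minkowski distance, so closer vectors score higher.

    p=2 gives the (negated) Euclidean distance.
    """
    dist = sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)
    return -dist


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(query, support, similarity):
    """Return a label -> probability dict for one query.

    `support` is a list of (embedding, label) pairs; with one pair
    per label this corresponds to the one-shot setting.
    """
    weights = softmax([similarity(query, emb) for emb, _ in support])
    probs = defaultdict(float)
    for weight, (_, label) in zip(weights, support):
        probs[label] += weight  # attention mass accumulates per label
    return dict(probs)


# Hypothetical toy data: two "cat" examples and one "dog" example.
support = [([1.0, 0.0], "cat"), ([0.0, 1.0], "dog"), ([0.9, 0.1], "cat")]
query = [0.8, 0.2]

probs_cos = predict(query, support, cosine)
probs_euc = predict(query, support, neg_minkowski)
```

Swapping the `similarity` argument is the only change needed to compare metrics in this sketch, which mirrors the kind of comparison the abstract describes.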

Code & Models

Repositories

adriangonz/statistical-nlp-17
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications