One-Shot Learning for Language Modelling
Talip Ucar, Adrian Gonzalez-Martin, Matthew Lee, Adrian Daniel Szwarc

TL;DR
This paper investigates one-shot learning in NLP by adapting matching networks, comparing various similarity measures, and establishing a benchmark for few-shot word prediction tasks using the WikiText-2 dataset.
Contribution
It explores the effectiveness of different distance metrics in few-shot learning and establishes a publicly available benchmark for future research in NLP.
Findings
No single distance metric is best for k-shot learning.
Performance depends on the number of shots used during training.
A benchmark dataset for one- to three-shot learning in NLP is provided.
Abstract
Humans can infer a great deal about the meaning of a word from the syntax and semantics of the surrounding words, even if it is their first time reading or hearing it. We can also generalise the learned concept of the word to new tasks. Despite great progress in achieving human-level performance in certain tasks (Silver et al., 2016), learning from one or a few examples remains a key challenge in machine learning, and has not been thoroughly explored in Natural Language Processing (NLP). In this work we tackle the problem of one-shot learning for an NLP task by employing ideas from recent developments in machine learning: embeddings, attention mechanisms (softmax) and similarity measures (cosine, Euclidean, Poincaré, and Minkowski). We adapt the framework suggested in matching networks (Vinyals et al., 2016), and explore the effectiveness of the aforementioned methods in one, two and…
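The abstract's combination of embeddings, a softmax attention mechanism, and a similarity measure can be illustrated with a minimal sketch in the spirit of matching networks. This is not the authors' implementation: the function names, the toy two-dimensional embeddings, and the labels are all hypothetical, and the Poincaré distance is omitted. The sketch assumes the support-set embeddings are already computed and non-zero.

```python
import math

def cosine_similarity(u, v):
    # Assumes neither vector is all zeros (otherwise this divides by zero).
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def negative_minkowski(u, v, p=2.0):
    # Negated so that closer vectors get higher scores; p=2 is Euclidean.
    return -sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

def softmax(scores):
    # Subtract the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label_distribution(query, support_vectors, support_labels, similarity):
    # Attention over the support set: softmax of query-to-support similarities.
    scores = [similarity(query, s) for s in support_vectors]
    attention = softmax(scores)
    # Sum attention weights per label (handles k-shot, where labels repeat).
    distribution = {}
    for weight, label in zip(attention, support_labels):
        distribution[label] = distribution.get(label, 0.0) + weight
    return distribution

# Hypothetical one-shot episode: one embedded example per candidate word.
support_vectors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
support_labels = ["cat", "dog", "fish"]
query = [0.9, 0.1]

probs_cosine = predict_label_distribution(
    query, support_vectors, support_labels, cosine_similarity)
probs_euclid = predict_label_distribution(
    query, support_vectors, support_labels, negative_minkowski)

predicted_cosine = max(probs_cosine, key=probs_cosine.get)
predicted_euclid = max(probs_euclid, key=probs_euclid.get)
```

Swapping the `similarity` argument is the only change needed to compare metrics on the same episode, which mirrors the kind of comparison the paper describes.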
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
