High-risk learning: acquiring new word vectors from tiny data

Aurelie Herbelot; Marco Baroni

arXiv:1707.06556 · cs.CL · July 21, 2017



TL;DR

This paper shows that a neural language model such as Word2Vec, with only minor modifications, can learn new words from very limited data by drawing on a previously learnt semantic space, and that it outperforms state-of-the-art models on a definitional task.

Contribution

It introduces a simple modification to Word2Vec enabling it to acquire new word vectors from tiny datasets using prior semantic space knowledge.

Findings

01

Large increase in performance on the definitional task.

02

Effective learning of new words from 2-6 sentences of context.

03

Outperforms state-of-the-art models on the definitional low-data task.

Abstract

Distributional semantics models are known to struggle with small data. It is generally accepted that in order to learn 'a good vector' for a word, a model must have sufficient examples of its usage. This contradicts the fact that humans can guess the meaning of a word from a few occurrences only. In this paper, we show that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space. We test our model on word definitions and on a nonce task involving 2-6 sentences' worth of context, showing a large increase in performance over state-of-the-art models on the definitional task.
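The idea of "using background knowledge from a previously learnt semantic space" can be illustrated with a small sketch. The code below is not the paper's modified Word2Vec; it shows only one simple way to build a vector for an unseen word, by summing the background vectors of the known words in its context. The toy 3-dimensional vectors, the made-up word "wampimuk", and the helper names `init_new_vector`, `cosine`, and `nearest` are all assumptions introduced for illustration.

```python
import numpy as np

# Toy background space: made-up 3-d vectors, for illustration only.
background = {
    "animal": np.array([0.9, 0.1, 0.0]),
    "fur":    np.array([0.8, 0.2, 0.1]),
    "pet":    np.array([0.7, 0.3, 0.0]),
    "engine": np.array([0.0, 0.1, 0.9]),
    "wheel":  np.array([0.1, 0.0, 0.8]),
}

def init_new_vector(context_words, space):
    """Sum the background vectors of the context words found in the space."""
    known = [space[w] for w in context_words if w in space]
    if not known:
        raise ValueError("no context word found in the background space")
    return np.sum(known, axis=0)

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def nearest(vector, space, k=2):
    """Return the k background words most similar to the given vector."""
    ranked = sorted(space, key=lambda w: cosine(vector, space[w]), reverse=True)
    return ranked[:k]

# A single definition-like sentence for a word absent from the background space.
sentence = "a wampimuk is a small animal with soft fur".split()
wampimuk = init_new_vector(sentence, background)
neighbours = nearest(wampimuk, background)
print(neighbours)
```

In this toy setting, only "animal" and "fur" are known context words, so the new vector lands near the animal-related region of the space rather than near "engine" or "wheel". A real system would load a pretrained space and would then refine the vector further.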


Code & Models

Repositories

minimalparts/nonce2vec (Official)
