TL;DR
This paper shows that a neural language model such as Word2Vec needs only minor modifications to learn new words from very limited data, by drawing on a previously learnt semantic space, and that it outperforms state-of-the-art models on a definitional task.
Contribution
The paper introduces a minor modification to the standard Word2Vec architecture that lets it learn vectors for new words from tiny data, using background knowledge from a previously learnt semantic space.
Findings
Large increase in performance over state-of-the-art models on the definitional task.
The model is also tested on a nonce task in which a new word is learned from 2-6 sentences of context.
Abstract
Distributional semantics models are known to struggle with small data. It is generally accepted that in order to learn 'a good vector' for a word, a model must have sufficient examples of its usage. This contradicts the fact that humans can guess the meaning of a word from a few occurrences only. In this paper, we show that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space. We test our model on word definitions and on a nonce task involving 2-6 sentences' worth of context, showing a large increase in performance over state-of-the-art models on the definitional task.
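To make the idea of "using background knowledge from a previously learnt semantic space" concrete, here is a minimal sketch of a simple additive approach: the vector for an unseen word is built by summing the background vectors of the known words around it. This is an illustration only, not the paper's modified Word2Vec. The toy 3-dimensional space `BACKGROUND`, the nonce word "wug", the example sentences, and the helper names `infer_vector` and `nearest` are all assumptions introduced here.

```python
import math

# Hypothetical background space: toy 3-d vectors standing in for embeddings
# learnt earlier on a large corpus (values are invented for illustration).
BACKGROUND = {
    "animal": [0.9, 0.1, 0.0],
    "fur":    [0.8, 0.2, 0.1],
    "pet":    [0.7, 0.3, 0.0],
    "engine": [0.0, 0.1, 0.9],
    "wheel":  [0.1, 0.0, 0.8],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def infer_vector(sentences, new_word, space):
    """Sum the background vectors of known words that co-occur with new_word."""
    dim = len(next(iter(space.values())))
    total = [0.0] * dim
    for sentence in sentences:
        tokens = sentence.lower().split()
        if new_word not in tokens:
            continue  # only sentences that mention the new word count as context
        for tok in tokens:
            if tok != new_word and tok in space:
                total = [t + x for t, x in zip(total, space[tok])]
    return total

def nearest(vec, space, k=2):
    """Return the k background words most similar to vec."""
    ranked = sorted(space, key=lambda w: cosine(vec, space[w]), reverse=True)
    return ranked[:k]

# Two sentences of context for the made-up word "wug".
sentences = [
    "a wug is a small animal with soft fur",
    "my neighbour keeps a wug as a pet",
]
wug = infer_vector(sentences, "wug", BACKGROUND)
print(nearest(wug, BACKGROUND))
```

With only two sentences, the inferred vector already sits closer to the animal-related words than to the vehicle-related ones in this toy space. A trained model would go further and update the new vector by gradient steps, which this sketch does not attempt.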
