TL;DR
This paper examines how incorporating subword information (character n-grams and unsupervised morphemes) into matrix factorization models improves word embeddings, especially for rare and out-of-vocabulary (OOV) words.
Contribution
The paper is the first to evaluate the impact of subword information on counting (matrix factorization) models for word embeddings, and it shows benefits similar to those already demonstrated for predictive models.
Findings
Subword information enhances embeddings for rare words.
Unsupervised morphemes improve representation quality.
Subword incorporation benefits out-of-vocabulary words.
Abstract
The positive effect of adding subword information to word embeddings has been demonstrated for predictive models. In this paper we investigate whether similar benefits can also be derived from incorporating subwords into counting models. We evaluate the impact of different types of subwords (n-grams and unsupervised morphemes), with results confirming the importance of subword information in learning representations of rare and out-of-vocabulary words.
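To make the idea of subword information concrete, here is a minimal sketch of how character n-grams can be extracted from a word and used to build a vector for a word that has no embedding of its own. It is an illustration only, not the paper's method: the function names `char_ngrams` and `oov_vector`, the `<` and `>` boundary markers, the n-gram length range, and the choice to average subword vectors are all assumptions made for this example.

```python
def char_ngrams(word, n_min=3, n_max=4):
    """Return the character n-grams of `word`, with assumed boundary markers."""
    marked = f"<{word}>"  # "<" and ">" mark word start and end (an assumption)
    return [
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    ]


def oov_vector(word, subword_vectors, dim):
    """Average the vectors of the word's n-grams found in `subword_vectors`.

    Averaging is an illustrative choice, not taken from the paper.
    """
    known = [subword_vectors[g] for g in char_ngrams(word) if g in subword_vectors]
    if not known:
        # No known subword: fall back to a zero vector.
        return [0.0] * dim
    return [sum(v[k] for v in known) / len(known) for k in range(dim)]


# Toy subword table with hypothetical 2-dimensional vectors.
toy_vectors = {"<ca": [1.0, 0.0], "cat": [0.0, 1.0], "at>": [1.0, 1.0]}

print(char_ngrams("cat", 3, 3))
print(oov_vector("cat", toy_vectors, dim=2))
```

A word absent from the vocabulary still shares n-grams with words seen in training, which is why subword units can supply a usable representation for rare and out-of-vocabulary words.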
