The Role of Context Types and Dimensionality in Learning Word Embeddings

Oren Melamud; David McClosky; Siddharth Patwardhan; Mohit Bansal

arXiv:1601.00893·cs.CL·July 20, 2017

TL;DR

This paper evaluates how context type and embedding dimensionality affect the performance of skip-gram word embeddings on a range of intrinsic and extrinsic NLP tasks. It finds that most extrinsic tasks need careful tuning of both settings, and it proposes a skip-gram variant that learns from weighted contexts of substitute words.

Contribution

The authors present what they describe as the first extensive evaluation of how context type and dimensionality affect skip-gram word embeddings, and they introduce a skip-gram variant that learns embeddings from weighted contexts of substitute words.

Findings

01

Intrinsic tasks tend to show a clear preference for particular context types and for higher dimensionality.

02

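Most extrinsic tasks need more careful tuning to find the best settings. Once the gains from higher dimensionality are mostly exhausted, concatenating embeddings learned with different context types can improve performance further.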

03

The paper proposes a new skip-gram variant that learns word embeddings from weighted contexts of substitute words.

Abstract

We provide the first extensive evaluation of how using different types of context to learn skip-gram word embeddings affects performance on a wide range of intrinsic and extrinsic NLP tasks. Our results suggest that while intrinsic tasks tend to exhibit a clear preference for particular types of contexts and higher dimensionality, more careful tuning is required for finding the optimal settings for most of the extrinsic tasks that we considered. Furthermore, for these extrinsic tasks, we find that once the benefit from increasing the embedding dimensionality is mostly exhausted, simple concatenation of word embeddings, learned with different context types, can yield further performance gains. As an additional contribution, we propose a new variant of the skip-gram model that learns word embeddings from weighted contexts of substitute words.
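
The "simple concatenation" mentioned in the abstract can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' code: the function names, the toy vectors, the two table names, and the choice to L2-normalize each vector before joining are all hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def concat_embeddings(tables, normalize=True):
    """Join per-word vectors from several embedding tables end to end.

    `tables` is a list of dicts mapping word -> list of floats, one dict
    per context type. Only words present in every table are kept.
    Normalizing before joining is an assumption made for this sketch.
    """
    shared = set.intersection(*(set(t) for t in tables))
    combined = {}
    for word in sorted(shared):
        parts = []
        for table in tables:
            vec = table[word]
            parts.extend(l2_normalize(vec) if normalize else vec)
        combined[word] = parts
    return combined


# Toy tables standing in for embeddings learned with two context types.
# The names and values are made up for illustration.
context_a_emb = {"cat": [3.0, 4.0], "dog": [1.0, 0.0]}
context_b_emb = {"cat": [0.0, 2.0], "dog": [0.0, 1.0], "fish": [1.0, 1.0]}

combined = concat_embeddings([context_a_emb, context_b_emb])
print(len(combined["cat"]))  # dimensionality is the sum of the inputs
```

The combined vector for each word has as many dimensions as the input vectors together, so the comparison that matters is against a single-context embedding of the same total size.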
