Learning Word Embeddings from Intrinsic and Extrinsic Views

Jifan Chen; Kan Chen; Xipeng Qiu; Qi Zhang; Xuanjing Huang; Zheng Zhang

arXiv:1608.05852 · cs.CL · August 23, 2016 · 2 cites

Open Access

TL;DR

This paper proposes a novel method for learning word embeddings by combining intrinsic descriptive information with extrinsic contextual data, improving representation quality especially for rare words.

Contribution

It introduces an integrated approach that leverages both intrinsic and extrinsic information for word embedding learning, addressing limitations of context-only models.

Findings

01 Enhanced performance on word similarity tasks

02 Improved results in reverse dictionary and link prediction

03 Effective in document classification

Abstract

While word embeddings are now predominant in natural language processing, most existing models learn them solely from context. However, such context-based embeddings are limited, since not every word's meaning can be learned from context alone. Moreover, representations of rare words are difficult to learn because of data sparsity. In this work, we address these issues by learning word representations that integrate intrinsic (descriptive) and extrinsic (contextual) information. To demonstrate the effectiveness of our model, we evaluate it on four tasks: word similarity, reverse dictionaries, Wiki link prediction, and document classification. Experimental results show that our model is effective for both word and document modeling.
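To make the idea of combining the two views concrete, here is a minimal sketch. It is not the model from the paper, whose architecture the abstract does not describe. It assumes that a word's intrinsic view can be approximated by averaging the vectors of the words in its description, and that the two views are blended by a fixed weight. The names `combine_views`, `mean_vector`, `alpha`, and the toy vectors are all hypothetical.

```python
from typing import Dict, List, Optional

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise average of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def combine_views(
    word: str,
    context_emb: Dict[str, Vector],
    definitions: Dict[str, List[str]],
    alpha: float = 0.5,
) -> Vector:
    """Blend a contextual (extrinsic) vector with a descriptive (intrinsic) one.

    Assumption for illustration only: the intrinsic vector is the mean of the
    contextual vectors of the words that appear in the word's description.
    """
    # Intrinsic view: average over description words that have a vector.
    known = [context_emb[w] for w in definitions.get(word, []) if w in context_emb]
    intrinsic: Optional[Vector] = mean_vector(known) if known else None
    # Extrinsic view: the context-trained vector, if the word was seen at all.
    extrinsic: Optional[Vector] = context_emb.get(word)

    if intrinsic is None and extrinsic is None:
        raise KeyError(f"no information available for {word!r}")
    if intrinsic is None:
        return list(extrinsic)
    if extrinsic is None:
        # Rare or unseen word: fall back to the description alone.
        return intrinsic
    return [alpha * e + (1 - alpha) * i for e, i in zip(extrinsic, intrinsic)]


# Toy data (hypothetical): "ocelot" has a description but no contextual vector.
context_emb = {"feline": [1.0, 0.0], "animal": [0.0, 1.0], "cat": [1.0, 1.0]}
definitions = {"cat": ["feline", "animal"], "ocelot": ["feline", "animal"]}

rare_vec = combine_views("ocelot", context_emb, definitions)
frequent_vec = combine_views("cat", context_emb, definitions)
```

In this toy setup the unseen word "ocelot" still receives a vector from its description, which is the rare-word situation the abstract highlights. A learned model would train both views jointly instead of using a fixed `alpha`.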

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques