Learning Word Embeddings from Intrinsic and Extrinsic Views
Jifan Chen, Kan Chen, Xipeng Qiu, Qi Zhang, Xuanjing Huang, Zheng Zhang

TL;DR
This paper proposes a novel method for learning word embeddings by combining intrinsic descriptive information with extrinsic contextual data, improving representation quality especially for rare words.
Contribution
It introduces an integrated approach that leverages both intrinsic and extrinsic information for word embedding learning, addressing limitations of context-only models.
Findings
Enhanced performance on word similarity tasks
Improved results in reverse dictionary and link prediction
Effective in document classification
Abstract
While word embeddings are currently predominant in natural language processing, most existing models learn them solely from their contexts. However, these context-based word embeddings are limited, since not all words' meanings can be learned from context alone. Moreover, it is difficult to learn representations of rare words due to the data sparsity problem. In this work, we address these issues by learning word representations that integrate their intrinsic (descriptive) and extrinsic (contextual) information. To demonstrate the effectiveness of our model, we evaluate it on four tasks: word similarity, reverse dictionaries, Wiki link prediction, and document classification. Experimental results show that our model is powerful in both word and document modeling.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
