Contextual Skipgram: Training Word Representation Using Context Information
Dongjae Kim, Jong-Kook Kim

TL;DR
This paper introduces Contextual Skip-gram, an improved word embedding model that uses context information to focus on relevant words, leading to better quality representations by reducing noise from irrelevant context words.
Contribution
It proposes a novel extension of the skip-gram model that incorporates context information to improve word embedding quality.
Findings
Enhanced word representations with better semantic accuracy
Reduced influence of irrelevant context words during training
Improved performance on downstream NLP tasks
Abstract
The skip-gram (SG) model learns word representation by predicting the words surrounding a center word from unstructured text data. However, not all words in the context window contribute to the meaning of the center word. For example, less relevant words could be in the context window, hindering the SG model from learning a better quality representation. In this paper, we propose an enhanced version of the SG that leverages context information to produce word representation. The proposed model, Contextual Skip-gram, is designed to predict contextual words with both the center words and the context information. This simple idea helps to reduce the impact of irrelevant words on the training process, thus enhancing the final performance.
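The difference between the two training setups described in the abstract can be sketched in a few lines of Python. The first function builds the standard skip-gram pairs (center word, surrounding word). The second is only an illustration of predicting a context word from both the center word and some context information: the abstract does not say how the context information is computed, so this sketch assumes it is simply the remaining words in the window. The function names `skipgram_pairs` and `contextual_examples` are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int) -> List[Tuple[str, str]]:
    """Standard skip-gram examples: (center word, one surrounding word)."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def contextual_examples(
    tokens: List[str], window: int
) -> List[Tuple[str, Tuple[str, ...], str]]:
    """Illustrative examples: (center word, context information, target word).

    ASSUMPTION: "context information" is taken to be the other words in the
    window, excluding the target. The paper may define it differently.
    """
    examples = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        neighbors = [j for j in range(lo, hi) if j != i]
        for j in neighbors:
            # Everything in the window except the center and the target.
            context_info = tuple(tokens[k] for k in neighbors if k != j)
            examples.append((center, context_info, tokens[j]))
    return examples


if __name__ == "__main__":
    sentence = ["the", "cat", "sat"]
    print(skipgram_pairs(sentence, window=1))
    print(contextual_examples(sentence, window=1))
```

In the second setup a model receives the center word together with the rest of the window, so in principle it could learn to rely less on the center word when the target is unrelated to it. How the actual model combines the two inputs is not stated in the abstract.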
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
