Substitute Based SCODE Word Embeddings in Supervised NLP Tasks
Volkan Cirik, Deniz Yuret

TL;DR
This paper introduces a substitute-based word embedding method that effectively captures contextual similarity, achieving state-of-the-art results in multilingual dependency parsing and competitive performance in other supervised NLP tasks.
Contribution
The paper presents a novel substitute-based embedding approach that matches or outperforms existing embeddings in several supervised NLP tasks and achieves state-of-the-art results in multilingual dependency parsing.
Findings
Achieves state-of-the-art results in multilingual dependency parsing
Performs as well as or better than other embeddings in NER and Chunking
Provides publicly available embeddings for 7 languages
Abstract
We analyze a word embedding method in supervised tasks. It maps words onto a sphere such that words co-occurring in similar contexts lie close together. The similarity of contexts is measured by the distribution of substitutes that can fill them. We compare word embeddings, including more recent representations, in Named Entity Recognition (NER), Chunking, and Dependency Parsing. We examine our framework in multilingual dependency parsing as well. The results show that the proposed method achieves results as good as or better than those of the other word embeddings in the tasks we investigate. It achieves state-of-the-art results in multilingual dependency parsing. Word embeddings in 7 languages are available for public use.
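The idea of measuring context similarity through substitute distributions can be illustrated with a toy sketch. This is not the authors' SCODE implementation: the contexts, the substitute counts, and the function names are hypothetical, and cosine similarity is used here only as one plausible way to compare two distributions.

```python
import math


def normalize(counts):
    """Turn a {word: count} mapping into a probability distribution."""
    total = sum(counts.values())
    return {word: value / total for word, value in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(value * q.get(word, 0.0) for word, value in p.items())
    norm_p = math.sqrt(sum(value * value for value in p.values()))
    norm_q = math.sqrt(sum(value * value for value in q.values()))
    return dot / (norm_p * norm_q)


# Hypothetical substitute counts: which words could fill the blank in each context.
ctx_a = normalize({"cat": 5, "dog": 4, "bird": 1})    # "the ___ sat on the mat"
ctx_b = normalize({"dog": 5, "cat": 3, "horse": 2})   # "the ___ ran across the yard"
ctx_c = normalize({"quickly": 6, "slowly": 4})        # "she walked ___ home"

# Contexts that admit similar substitutes score high; unrelated ones score zero.
sim_ab = cosine(ctx_a, ctx_b)
sim_ac = cosine(ctx_a, ctx_c)
print(f"a vs b: {sim_ab:.3f}, a vs c: {sim_ac:.3f}")
```

In this toy setting the two noun contexts share most of their substitutes and come out as similar, while the adverb context shares none. An embedding method built on this signal would then aim to place words that occur in similar contexts near each other on the sphere.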
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
