An Ensemble Method for Producing Word Representations focusing on the Greek Language
Michalis Lioudakis, Stamatis Outsios, Michalis Vazirgiannis

TL;DR
This paper introduces CBOS, an ensemble method combining CBOW and Skip-gram, to generate high-quality Greek word representations, demonstrating state-of-the-art results across multiple datasets and tasks.
Contribution
The paper proposes a novel ensemble approach, CBOS, specifically tailored for Greek, improving word representation quality over existing methods.
Findings
CBOS outperforms traditional methods in intrinsic evaluations.
CBOS achieves superior results in extrinsic NLP tasks.
The method is effective across three corpora: the English Wikipedia, the modern Greek Wikipedia, and the modern Greek Web Content corpus.
Abstract
In this paper we present a new ensemble method, Continuous Bag-of-Skip-grams (CBOS), that produces high-quality word representations with an emphasis on the modern Greek language. The CBOS method combines the two pioneering approaches for learning word representations: Continuous Bag-of-Words (CBOW) and Continuous Skip-gram. These methods are compared through intrinsic and extrinsic evaluation tasks on three different sources of data: the English Wikipedia corpus, the modern Greek Wikipedia corpus, and the modern Greek Web Content corpus. The comparison across tasks and datasets shows that the CBOS method achieves state-of-the-art performance.
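The abstract does not spell out how CBOS combines its two components, so the sketch below does not attempt to reproduce CBOS itself. It only illustrates, under the standard definitions, how the two base methods frame their training examples: CBOW predicts a center word from its surrounding context, while Skip-gram predicts each context word from the center word. The function names, the `window` parameter, and the sample sentence are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW framing: (context words, center word to predict)."""
    examples = []
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip tokens that have no neighbours at all
            examples.append((context, center))
    return examples


def skipgram_examples(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram framing: (center word, one context word to predict)."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    # Illustrative sentence only; not taken from the paper's corpora.
    sentence = ["η", "γάτα", "κοιμάται"]
    print(cbow_examples(sentence, window=1))
    print(skipgram_examples(sentence, window=1))
```

With the same window, CBOW yields one example per token with the context grouped together, whereas Skip-gram yields one example per (center, context) pair. An ensemble such as CBOS would have to reconcile these two views of the same text.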
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies
