Evaluation of Morphological Embeddings for the Russian Language
Vitaly Romanov, Albina Khusainova

TL;DR
This study evaluates morphology-based word embeddings for Russian and finds that they do not outperform FastText, while BERT, a morphology-unaware model, performs significantly better on POS tagging, chunking, and NER.
Contribution
It provides an empirical comparison of morphology-based embeddings, FastText, and BERT on Russian POS tagging, chunking, and NER, highlighting the limited benefits of morphology-based embeddings.
Findings
Morphology-based embeddings do not outperform FastText.
BERT significantly outperforms morphology-based embeddings on NLP tasks.
Morphology-aware models do not necessarily improve performance on Russian NLP tasks.
Abstract
A number of morphology-based word embedding models have been introduced in recent years. However, their evaluation has mostly been limited to English, which is known to be a morphologically simple language. In this paper, we explore whether and to what extent incorporating morphology into word embeddings improves performance on downstream NLP tasks in the case of Russian, a morphologically rich language. The NLP tasks we chose are POS tagging, chunking, and NER; for Russian, all three can be mostly solved using only morphology, without understanding the semantics of words. Our experiments show that morphology-based embeddings trained with the Skipgram objective do not outperform an existing embedding model, FastText. Moreover, a more complex but morphology-unaware model, BERT, achieves significantly greater performance on the tasks that presumably require understanding of a word's…
Taxonomy
Methods: Linear Layer · Residual Connection · Linear Warmup With Linear Decay · Weight Decay · Multi-Head Attention · Dense Connections · Softmax · Layer Normalization · Attention Dropout
