TL;DR
This study compares static and contextual word embeddings for Turkish in intrinsic and extrinsic NLP tasks, providing insights into their suitability and creating a public Turkish embedding repository.
Contribution
It is the first comprehensive comparison of static and contextual embeddings specifically for Turkish, including detailed syntactic and semantic analysis.
Findings
Static and contextual models show different strengths in syntactic and semantic tasks.
The study provides a Turkish word embedding repository for future research.
The results offer insight into how well each embedding model suits various Turkish NLP tasks.
Abstract
Word embeddings are fixed-length, dense, distributed word representations used in natural language processing (NLP) applications. Word embedding models fall into two types: non-contextual (static) models and contextual models. The former generate a single embedding for a word regardless of its context, while the latter produce distinct embeddings for a word depending on the specific contexts in which it appears. Many studies compare contextual and non-contextual embedding models within their respective groups in different languages. However, very few studies compare models across the two groups, and no such study exists for Turkish. Such a comparison requires converting contextual embeddings into static embeddings. In this paper, we compare and evaluate the performance of several…
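The abstract notes that comparing the two groups requires converting contextual embeddings into static ones. The excerpt here does not say how the authors do this. One common way, shown below as a minimal sketch, is to average the contextual vectors of every occurrence of a word. The function name `pool_contextual_to_static` and the toy vectors are hypothetical and chosen only for illustration; a real pipeline would take the per-occurrence vectors from a contextual model rather than hard-coding them.

```python
def pool_contextual_to_static(occurrences):
    """Average per-occurrence vectors into one static vector per word.

    `occurrences` is an iterable of (word, vector) pairs, where each vector
    is the contextual embedding of a single occurrence of `word`.
    Mean pooling is an assumption here, not necessarily the paper's method.
    """
    sums = {}
    counts = {}
    for word, vec in occurrences:
        if word not in sums:
            sums[word] = [0.0] * len(vec)
            counts[word] = 0
        elif len(vec) != len(sums[word]):
            raise ValueError(f"inconsistent vector length for {word!r}")
        for i, x in enumerate(vec):
            sums[word][i] += x
        counts[word] += 1
    # Divide each accumulated sum by the number of occurrences of that word.
    return {w: [s / counts[w] for s in total] for w, total in sums.items()}


# Made-up 2-dimensional vectors. "yüz" is ambiguous in Turkish
# ("face", "hundred", "swim"), so a contextual model would give each
# occurrence a different vector; a static model has only one.
toy_occurrences = [
    ("yüz", [1.0, 0.0]),
    ("yüz", [0.0, 1.0]),
    ("kitap", [0.5, 0.5]),
]

static = pool_contextual_to_static(toy_occurrences)
print(static["yüz"])
```

Averaging collapses the sense distinctions that contextual models capture, which is one reason a converted embedding can behave differently from the original model in downstream tasks.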
