A Survey of Word Embeddings Evaluation Methods
Amir Bakarov

TL;DR
This paper provides a comprehensive overview of evaluation methods for word embeddings, categorizing and analyzing 28 different intrinsic and extrinsic approaches, and discussing key challenges in the field.
Contribution
It proposes a typology of word embedding evaluation methods, systematizes information about existing techniques and evaluation datasets, and highlights open problems in the field.
Findings
16 intrinsic evaluation methods summarized
12 extrinsic evaluation methods summarized
Discussion of key challenges in evaluation
Abstract
Word embeddings are real-valued word representations that capture lexical semantics and are trained on natural language corpora. Models proposing these representations have gained popularity in recent years, but the question of which evaluation method is most adequate remains open. This paper presents an extensive overview of the field of word embedding evaluation, highlighting the main problems and proposing a typology of evaluation approaches that summarizes 16 intrinsic methods and 12 extrinsic methods. I describe both widely used and experimental methods, systematize information about evaluation datasets, and discuss some key challenges.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Authorship Attribution and Profiling
