Evaluating Word Embeddings in Multi-label Classification Using Fine-grained Name Typing
Yadollah Yaghoobzadeh, Katharina Kann, Hinrich Schütze

TL;DR
This paper introduces a novel evaluation method for word embeddings using multi-label classification on fine-grained name typing, enabling detailed analysis of embedding properties without relying on sentence context.
Contribution
It proposes a new large-scale, fine-grained evaluation approach for word embeddings based on multi-label classification of name types, complementing existing methods.
Findings
The method effectively assesses embedding quality at a fine-grained level.
It provides a large, detailed dataset for embedding evaluation.
The approach avoids confounding factors present in sentence-based evaluations.
Abstract
Embedding models typically associate each word with a single real-valued vector, representing its different properties. Evaluation methods, therefore, need to analyze the accuracy and completeness of these properties in embeddings. This requires fine-grained analysis of embedding subspaces. Multi-label classification is an appropriate way to do so. We propose a new evaluation method for word embeddings based on multi-label classification given a word embedding. The task we use is fine-grained name typing: given a large corpus, find all types that a name can refer to based on the name embedding. Given the scale of entities in knowledge bases, we can build datasets for this task that are complementary to current embedding evaluation datasets in that they are very large, contain fine-grained classes, and allow the direct evaluation of embeddings without confounding factors like sentence…
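The task setup in the abstract (predict every type a name can have from its embedding alone) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-type inventory, the toy 3-dimensional embeddings, the choice of one independent logistic regression per type (binary relevance), and micro-averaged F1 as the metric.

```python
import math

TYPES = ["person", "location", "organization"]  # hypothetical type inventory

# Toy 3-d name embeddings with gold type vectors (made up for illustration).
# Each row of TRAIN_Y marks which entries of TYPES apply to that name.
TRAIN_X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0],
           [0.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
TRAIN_Y = [[1, 0, 0], [0, 1, 0], [1, 1, 0],
           [0, 0, 1], [0, 1, 1], [1, 0, 1]]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    """Probability that the name with embedding x has the type (w, b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train(X, Y, epochs=500, lr=0.5):
    """Fit one independent logistic regression per type with plain SGD."""
    n_dims, n_types = len(X[0]), len(Y[0])
    W = [[0.0] * n_dims for _ in range(n_types)]
    B = [0.0] * n_types
    for _ in range(epochs):
        for x, y in zip(X, Y):
            for t in range(n_types):
                grad = score(W[t], B[t], x) - y[t]  # d(log loss)/d(logit)
                W[t] = [wi - lr * grad * xi for wi, xi in zip(W[t], x)]
                B[t] -= lr * grad
    return W, B


def predict(W, B, x, threshold=0.5):
    """Return a 0/1 decision for every type; several may be 1 at once."""
    return [int(score(w, b, x) > threshold) for w, b in zip(W, B)]


def micro_f1(gold, pred):
    """Micro-averaged F1 over all (name, type) decisions."""
    tp = fp = fn = 0
    for g_row, p_row in zip(gold, pred):
        for g, p in zip(g_row, p_row):
            tp += int(g == 1 and p == 1)
            fp += int(g == 0 and p == 1)
            fn += int(g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


W, B = train(TRAIN_X, TRAIN_Y)
train_pred = [predict(W, B, x) for x in TRAIN_X]
print("micro-F1 on the toy training names:", micro_f1(TRAIN_Y, train_pred))
```

The sketch scores the classifier on its own training names only to keep the example short. A real evaluation would use held-out names, and the input would be embeddings learned from a corpus rather than hand-written vectors. The classifier never sees a sentence, which matches the abstract's point about avoiding sentence context as a confounding factor.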
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
