Issues in evaluating semantic spaces using word analogies
Tal Linzen

TL;DR
This paper critically examines the offset method for evaluating semantic spaces via word analogies. It shows that the method's reliance on cosine similarity conflates offset consistency with neighborhood structure, and it proposes simple baselines that make the evaluation more useful.
Contribution
The paper shows how the offset method's reliance on cosine similarity confounds its results, and introduces simple baselines that make analogy-based evaluation of vector spaces more informative.
Findings
Offset method conflates offset consistency with largely irrelevant neighborhood structure
Proposed baselines improve the utility of the method for evaluating vector spaces
Highlights limitations of analogy-based evaluation of semantic spaces
Abstract
The offset method for solving word analogies has become a standard evaluation tool for vector-space semantic models: it is considered desirable for a space to represent semantic relations as consistent vector offsets. We show that the method's reliance on cosine similarity conflates offset consistency with largely irrelevant neighborhood structure, and propose simple baselines that should be used to improve the utility of the method in vector space evaluation.
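The offset method described in the abstract can be sketched in a few lines. The block below is a minimal illustration, not the paper's code. The toy vectors, the function names (`nearest`, `offset_method`, `only_b_baseline`), and the convention of excluding the three query words from the candidates are all assumptions made for the example. The second function is one plausible baseline of the kind the abstract calls for: it ignores the offset entirely and returns the nearest neighbor of `b`.

```python
import numpy as np


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    return v / np.linalg.norm(v)


def nearest(target, vocab, exclude=()):
    """Return the word in vocab with the highest cosine similarity to target."""
    t = normalize(target)
    best_word, best_sim = None, -np.inf
    for word, vec in vocab.items():
        if word in exclude:
            continue
        sim = float(np.dot(normalize(vec), t))
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word


def offset_method(a, a_star, b, vocab):
    """Solve a : a_star :: b : ? by searching near a_star - a + b."""
    target = vocab[a_star] - vocab[a] + vocab[b]
    # Assumption: the query words themselves are excluded from the candidates.
    return nearest(target, vocab, exclude={a, a_star, b})


def only_b_baseline(a, a_star, b, vocab):
    """Hypothetical baseline: ignore the offset and return b's nearest neighbor."""
    return nearest(vocab[b], vocab, exclude={a, a_star, b})


# Made-up 3-dimensional vectors, for illustration only.
toy_vocab = {
    "man": np.array([1.0, 0.0, 0.0]),
    "woman": np.array([1.0, 1.0, 0.0]),
    "king": np.array([1.0, 0.0, 1.0]),
    "queen": np.array([1.0, 1.0, 1.0]),
    "apple": np.array([0.0, 0.2, -1.0]),
}

offset_answer = offset_method("man", "woman", "king", toy_vocab)
baseline_answer = only_b_baseline("man", "woman", "king", toy_vocab)
print(offset_answer, baseline_answer)
```

In this toy space both functions return "queen", even though the baseline never looks at the man-to-woman offset. A correct answer from the offset method alone therefore does not show that the space encodes the relation as a consistent offset. It may only reflect which words happen to lie near `b`.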
