Issues in evaluating semantic spaces using word analogies

Tal Linzen

arXiv:1606.07736 · cs.CL · June 27, 2016

TL;DR

This paper critically examines the offset method for evaluating semantic spaces with word analogies, shows where the method falls short, and proposes baselines that make it a more useful evaluation tool.

Contribution

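The paper shows that the offset method's reliance on cosine similarity conflates offset consistency with largely irrelevant neighborhood structure, and it proposes simple baselines that make the method more useful for evaluating vector spaces.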

Findings

1. The offset method conflates offset consistency with neighborhood structure.

2. Simple baselines are proposed to make offset-based evaluation more informative.

3. Current analogy-based assessments of semantic spaces have limitations.

Abstract

The offset method for solving word analogies has become a standard evaluation tool for vector-space semantic models: it is considered desirable for a space to represent semantic relations as consistent vector offsets. We show that the method's reliance on cosine similarity conflates offset consistency with largely irrelevant neighborhood structure, and propose simple baselines that should be used to improve the utility of the method in vector space evaluation.
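To make the abstract's claim concrete, here is a minimal sketch of the offset method next to a neighborhood-only baseline. It assumes the usual formulation in which the analogy a : a* :: b : ? is answered with the word whose vector has the highest cosine similarity to a* - a + b, excluding the three input words. The baseline, which simply returns the nearest neighbor of b, is an illustrative guess at what a "simple baseline" could look like; the abstract does not specify the paper's baselines. The toy vectors and the function names (`offset_method`, `only_b_baseline`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def nearest(space, target, exclude=()):
    """Word in `space` (outside `exclude`) most cosine-similar to `target`."""
    candidates = (w for w in space if w not in exclude)
    return max(candidates, key=lambda w: cosine(space[w], target))


def offset_method(space, a, a_star, b):
    """Answer a : a_star :: b : ? using the target vector a_star - a + b."""
    target = [s - t + u for s, t, u in zip(space[a_star], space[a], space[b])]
    return nearest(space, target, exclude={a, a_star, b})


def only_b_baseline(space, a, a_star, b):
    """Illustrative baseline (an assumption): ignore the offset entirely
    and return the nearest neighbor of b."""
    return nearest(space, space[b], exclude={a, a_star, b})


# Hypothetical 2-d toy space: "cats" lies very close to "cat".
space = {
    "dog": [1.0, 0.0],
    "dogs": [1.0, 0.1],
    "cat": [0.9, 0.3],
    "cats": [0.9, 0.4],
    "car": [0.0, 1.0],
}

print(offset_method(space, "dog", "dogs", "cat"))    # cats
print(only_b_baseline(space, "dog", "dogs", "cat"))  # cats
```

In this toy space both functions return the same answer, so a correct result from the offset method alone does not show that the offset carries the relation; the neighborhood of b may be doing the work. Comparing against such a baseline is one way to separate the two effects.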
