Testing APSyn against Vector Cosine on Similarity Estimation
Enrico Santus, Emmanuele Chersoni, Alessandro Lenci, Chu-Ren Huang, Philippe Blache

TL;DR
This paper evaluates APSyn, a new similarity measure based on context intersection, and finds it competitive with Vector Cosine in word similarity tasks, addressing some of Cosine's limitations.
Contribution
The study introduces and empirically tests APSyn, a novel similarity measure that considers context intersection and relevance, showing that it is highly competitive with the traditional Vector Cosine.
Findings
APSyn performs competitively with Vector Cosine in similarity estimation.
APSyn addresses some weaknesses of Vector Cosine, especially on genuine similarity tasks.
Results demonstrate APSyn's robustness across multiple test sets.
Abstract
In Distributional Semantic Models (DSMs), Vector Cosine is widely used to estimate similarity between word vectors, although it has been shown to suffer from several shortcomings. The recent literature has proposed other methods which attempt to mitigate such biases. In this paper, we investigate APSyn, a measure that computes the extent of the intersection between the most associated contexts of two target words, weighting it by context relevance. We evaluated this metric in a similarity estimation task on several popular test sets, and our results show that APSyn is in fact highly competitive, even with respect to the results reported in the literature for word embeddings. Moreover, APSyn addresses some of the weaknesses of Vector Cosine and also performs well on genuine similarity estimation.
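The abstract describes APSyn only in words, so the following is a minimal sketch of the idea, not the authors' implementation. It assumes each word is represented as a mapping from context to association weight (for example a PMI-style score), that the "most associated contexts" are the top `n` by weight, and that relevance is weighted by the inverse of the average rank of a shared context in the two lists. The function names, the toy weights, and the value of `n` are hypothetical. A plain Vector Cosine over the same sparse vectors is included for comparison.

```python
import math


def top_contexts(weights, n):
    """Map each of the n highest-weighted contexts to its rank (1 = strongest)."""
    # Ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
    return {ctx: rank for rank, (ctx, _) in enumerate(ranked, start=1)}


def apsyn(weights_a, weights_b, n=1000):
    """Sketch of APSyn: score the overlap of the two top-n context lists.

    Assumption: each shared context contributes 1 / (average of its two ranks),
    so contexts that are highly ranked for both words count the most.
    """
    ranks_a = top_contexts(weights_a, n)
    ranks_b = top_contexts(weights_b, n)
    shared = ranks_a.keys() & ranks_b.keys()
    return sum(1.0 / ((ranks_a[c] + ranks_b[c]) / 2.0) for c in shared)


def cosine(weights_a, weights_b):
    """Vector Cosine over sparse context vectors stored as dicts."""
    dot = sum(w * weights_b.get(c, 0.0) for c, w in weights_a.items())
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy, made-up association weights (hypothetical data, for illustration only).
dog = {"bark": 5.0, "tail": 3.0, "pet": 2.0, "the": 0.1}
cat = {"meow": 5.0, "pet": 4.0, "tail": 3.0, "the": 0.2}

apsyn_score = apsyn(dog, cat, n=3)
cosine_score = cosine(dog, cat)
```

With `n=3`, the low-weight context `"the"` falls outside both top lists and does not affect the APSyn score, whereas Vector Cosine uses every dimension of both vectors. This is one way to read the abstract's claim that restricting the comparison to the most associated contexts mitigates some of Cosine's biases.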
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Mental Health via Writing
