Distributional Measures of Semantic Distance: A Survey
Saif M. Mohammad, Graeme Hirst

TL;DR
This survey reviews distributional and WordNet-based measures of semantic distance, analyzing their strengths, limitations, and potential for aligning with human semantic judgments, including recent hybrid approaches.
Contribution
It provides a comprehensive comparison of distributional and knowledge-based semantic distance measures and discusses ways to improve their alignment with human perception.
Findings
Distributional measures are useful in resource-poor languages.
WordNet-based measures generally align better with human judgment.
Hybrid approaches show promise for improved semantic distance estimation.
Abstract
The ability to mimic human notions of semantic distance has widespread applications. Some measures rely only on raw text (distributional measures) and some rely on knowledge sources such as WordNet. Although extensive studies have been performed to compare WordNet-based measures with human judgment, the use of distributional measures as proxies to estimate semantic distance has received little attention. Even though they have traditionally performed poorly when compared to WordNet-based measures, they lay claim to certain uniquely attractive features, such as their applicability in resource-poor languages and their ability to mimic both semantic similarity and semantic relatedness. Therefore, this paper presents a detailed study of distributional measures. Particular attention is paid to fleshing out the strengths and limitations of both WordNet-based and distributional measures, and how…
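To make the idea of a measure that relies "only on raw text" concrete, here is a minimal sketch of one common distributional approach: represent each word by counts of the words that co-occur with it, then take the cosine distance between those count vectors. This is an illustration under stated assumptions, not necessarily one of the specific measures the survey analyzes. The function names, the window size, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: a word with no observed context is maximally distant
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy corpus, far too small for real use.
TOY_CORPUS = [
    "the doctor treated the patient",
    "the nurse treated the patient",
    "the pilot flew the plane",
]

vectors = cooccurrence_vectors(TOY_CORPUS)
d_close = cosine_distance(vectors["doctor"], vectors["nurse"])
d_far = cosine_distance(vectors["doctor"], vectors["pilot"])
```

In this toy corpus, "doctor" and "nurse" share identical contexts, so their distance comes out smaller than that between "doctor" and "pilot". No lexical resource such as WordNet is consulted, which is why this family of measures remains applicable to resource-poor languages.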
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Language and cultural evolution
