A Comparison of WordNet and Roget's Taxonomy for Measuring Semantic Similarity

Michael Mc Hale (Air Force Research Laboratory)

arXiv:cmp-lg/9809003 · cmp-lg · May 23, 2007 · 3 cites

Open Access

TL;DR

This paper evaluates Roget's International Thesaurus as the taxonomy for semantic similarity measurement and compares it with WordNet. Traditional edge counting over Roget's yields results close to human judgments.

Contribution

It uses Roget's Thesaurus as the taxonomy for semantic similarity measurement and compares its performance with that of WordNet.

Findings

01

Edge counting with Roget's achieves a correlation of r=0.88 with human judgments.

02

Roget's Thesaurus performs comparably to WordNet in semantic similarity tasks.

03

Traditional edge counting is surprisingly effective for measuring semantic similarity.

Abstract

This paper presents the results of using Roget's International Thesaurus as the taxonomy in a semantic similarity measurement task. Four similarity metrics were taken from the literature and applied to Roget's. The experimental evaluation suggests that the traditional edge counting approach does surprisingly well (a correlation of r=0.88 with a benchmark set of human similarity judgements, with an upper bound of r=0.90 for human subjects performing the same task).
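To make the two quantities in the abstract concrete, the following is a minimal sketch of edge counting over a tree-shaped taxonomy and of the Pearson correlation r used to compare metric scores against human ratings. The taxonomy `PARENT` and the function names are hypothetical illustrations; they are not drawn from Roget's, WordNet, or the paper's own implementation.

```python
from math import sqrt

# Hypothetical toy taxonomy (child -> parent). Not taken from Roget's or WordNet.
PARENT = {
    "car": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "artifact",
    "hammer": "tool",
    "tool": "artifact",
    "artifact": None,  # root
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def edge_count(a, b):
    """Number of edges on the path between a and b via their lowest common ancestor."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for steps_from_b, n in enumerate(path_to_root(b)):
        if n in steps_from_a:
            return steps_from_a[n] + steps_from_b
    raise ValueError("nodes share no common ancestor")


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


print(edge_count("car", "bicycle"))  # 2
print(edge_count("car", "hammer"))   # 4
```

Edge counts are distances, so a shorter path means greater similarity. In an evaluation of this kind, the distances for a set of word pairs would be converted to similarity scores (for example by negation) and passed to `pearson_r` together with the human ratings for the same pairs.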

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques