Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora

George Kour; Samuel Ackerman; Orna Raz; Eitan Farchi; Boaz Carmeli; Ateret Anaby-Tavor

arXiv:2211.16259·cs.CL·November 30, 2022

TL;DR

This paper introduces automatic, interpretable evaluation measures for semantic similarity metrics at the corpus level, enabling better comparison and understanding of their behavior in NLP applications.

Contribution

The paper proposes a novel set of evaluation measures for corpus-level semantic similarity metrics, allowing their characteristics to be compared and analyzed in a meaningful way.

Findings

01 Recently developed metrics are better at identifying semantic distributional mismatch

02 Classical metrics are more sensitive to perturbations of the surface text

03 The proposed evaluation measures capture fundamental characteristics of the metrics

Abstract

The ability to compare the semantic similarity between text corpora is important in a variety of natural language processing applications. However, standard methods for evaluating the metrics used for this comparison have yet to be established. We propose a set of automatic and interpretable measures for assessing the characteristics of corpus-level semantic similarity metrics, allowing sensible comparison of their behavior. We demonstrate the effectiveness of our evaluation measures in capturing fundamental characteristics by applying them to a collection of classical and state-of-the-art metrics. Our measures revealed that recently developed metrics are becoming better at identifying semantic distributional mismatch, while classical metrics are more sensitive to perturbations at the surface-text level.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques