A Dynamic, Interpreted CheckList for Meaning-oriented NLG Metric Evaluation -- through the Lens of Semantic Similarity Rating
Laura Zeidler, Juri Opitz, Anette Frank

TL;DR
This paper introduces a dynamic, meaning-focused CheckList for evaluating NLG metrics, especially for AMR-to-text, and demonstrates its utility by developing a new graph-based metric called GraCo.
Contribution
It proposes a novel, interpretable CheckList organized around linguistic phenomena for NLG metric evaluation and introduces GraCo, a new AMR-based lexical cohesion graph metric.
Findings
The CheckList reveals strengths and weaknesses of both novel and traditional NLG metrics.
GraCo shows promise as a meaning-oriented evaluation metric.
Graph-based metrics benefit from AMR representations in NLG evaluation.
Abstract
Evaluating the quality of generated text is difficult, since traditional NLG evaluation metrics, focusing more on surface form than meaning, often fail to assign appropriate scores. This is especially problematic for AMR-to-text evaluation, given the abstract nature of AMR. Our work aims to support the development and improvement of NLG evaluation metrics that focus on meaning, by developing a dynamic CheckList for NLG metrics that is interpreted by being organized around meaning-relevant linguistic phenomena. Each test instance consists of a pair of sentences with their AMR graphs and a human-produced textual semantic similarity or relatedness score. Our CheckList facilitates comparative evaluation of metrics and reveals strengths and weaknesses of novel and traditional metrics. We demonstrate the usefulness of CheckList by designing a new metric GraCo that computes lexical cohesion…
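The test-instance layout described in the abstract (a sentence pair, the two AMR graphs, and a human similarity or relatedness score, grouped by linguistic phenomenon) can be sketched as a small data structure with a per-phenomenon evaluation loop. This is a minimal illustration, not the authors' implementation. The class and field names, the string form of the AMR graphs, the use of Spearman rank correlation as the agreement measure, and the toy `token_overlap` metric are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckListInstance:
    """One test instance (hypothetical field names)."""
    phenomenon: str     # linguistic phenomenon the pair probes (assumed label)
    sentence_a: str
    sentence_b: str
    amr_a: str          # AMR graph serialized as a string (assumed format)
    amr_b: str
    human_score: float  # human similarity or relatedness rating


def average_ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def token_overlap(inst: CheckListInstance) -> float:
    """Toy surface-form metric (Jaccard overlap of lowercased tokens)."""
    a = set(inst.sentence_a.lower().split())
    b = set(inst.sentence_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def evaluate_by_phenomenon(
    instances: List[CheckListInstance],
    metric: Callable[[CheckListInstance], float],
) -> Dict[str, float]:
    """Correlate metric scores with human scores separately per phenomenon."""
    groups: Dict[str, List[CheckListInstance]] = defaultdict(list)
    for inst in instances:
        groups[inst.phenomenon].append(inst)
    return {
        name: spearman(
            [metric(inst) for inst in group],
            [inst.human_score for inst in group],
        )
        for name, group in groups.items()
    }
```

Reporting one correlation per phenomenon, rather than a single pooled number, is what would let a comparison show where a given metric is strong or weak. Any metric that maps an instance to a score, whether it reads the sentences or the AMR graphs, can be passed in place of `token_overlap`.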
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
