RoMe: A Robust Metric for Evaluating Natural Language Generation

Md Rashad Al Hasan Rony; Liubov Kovriguina; Debanjan Chaudhuri; Ricardo Usbeck; Jens Lehmann

arXiv:2203.09183 · cs.CL · March 18, 2022 · 1 citation

Open Access · 1 Repo

TL;DR

RoMe is a new automatic evaluation metric for NLG that combines semantic similarity, syntactic variation, and grammatical acceptability, showing stronger correlation with human judgments than existing metrics.

Contribution

This paper introduces RoMe, a robust, self-supervised neural network-based metric that integrates multiple aspects of language understanding for improved NLG evaluation.

Findings

01 RoMe outperforms existing metrics in correlating with human judgments.

02 RoMe demonstrates strong robustness across various NLG tasks.

03 Extensive analysis confirms RoMe's effectiveness and reliability.

Abstract

Evaluating Natural Language Generation (NLG) systems is a challenging task. Firstly, the metric should ensure that the generated hypothesis reflects the reference's semantics. Secondly, it should consider the grammatical quality of the generated sentence. Thirdly, it should be robust enough to handle various surface forms of the generated sentence. Thus, an effective evaluation metric has to be multifaceted. In this paper, we propose an automatic evaluation metric incorporating several core aspects of natural language understanding (language competence, syntactic and semantic variation). Our proposed metric, RoMe, is trained on language features such as semantic similarity combined with tree edit distance and grammatical acceptability, using a self-supervised neural network to assess the overall quality of the generated sentence. Moreover, we perform an extensive robustness analysis of…
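The abstract names tree edit distance as one of the features RoMe is trained on. As a rough illustration of what that quantity measures, here is a minimal sketch of an ordered tree edit distance with unit costs for inserting, deleting, and relabelling a node. It is a textbook forest recursion, not the authors' implementation; the tree encoding, the function names, and the example parse trees are all assumptions made for this sketch, and the variant RoMe actually uses may differ.

```python
from functools import lru_cache

# Assumed encoding (hypothetical, for illustration only):
# a tree is (label, children), where children is a tuple of trees;
# a forest is a tuple of trees.


def forest_size(forest):
    """Count every node in a forest."""
    return sum(1 + forest_size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f and not g:
        return 0
    if not f:
        return forest_size(g)  # insert every node of g
    if not g:
        return forest_size(f)  # delete every node of f

    (v_label, v_children), (w_label, w_children) = f[-1], g[-1]

    # Delete the root of the rightmost tree in f; its children move up.
    delete = forest_distance(f[:-1] + v_children, g) + 1
    # Insert the root of the rightmost tree in g.
    insert = forest_distance(f, g[:-1] + w_children) + 1
    # Match the two rightmost roots, relabelling if the labels differ.
    match = (
        forest_distance(v_children, w_children)
        + forest_distance(f[:-1], g[:-1])
        + (v_label != w_label)
    )
    return min(delete, insert, match)


def tree_edit_distance(a, b):
    """Edit distance between two single trees."""
    return forest_distance((a,), (b,))


# Hypothetical toy parse trees: the hypothesis has one extra node under VP.
reference = ("S", (("NP", ()), ("VP", ())))
hypothesis = ("S", (("NP", ()), ("VP", (("V", ()),))))

print(tree_edit_distance(reference, hypothesis))  # 1
```

A small distance means the two syntactic structures differ by only a few node operations. How RoMe combines this signal with semantic similarity and grammatical acceptability is described in the paper itself and is not reproduced here.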

Code & Models

Repositories

rashad101/rome (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Rank-One Model Editing