Revisiting Grammatical Error Correction Evaluation and Beyond

Peiyuan Gong; Xuebo Liu; Heyan Huang; Min Zhang

arXiv:2211.01635·cs.CL·November 4, 2022

TL;DR

This paper introduces PT-M2, a pretraining-based evaluation metric for grammatical error correction (GEC) that scores only the corrected parts of a system output, improving correlation with human judgments and setting a new state of the art.

Contribution

The paper proposes a new GEC evaluation metric, PT-M2, that uses pretrained models to score only the corrected parts of a system output, addressing the tendency of previous PT-based metrics to pay excessive attention to unchanged text.

Findings

01 PT-M2 achieves a Pearson correlation of 0.949 on CoNLL14.

02 PT-M2 outperforms existing GEC evaluation methods.

03 PT-M2 is robust for evaluating competitive GEC systems.
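The first finding is stated as a Pearson correlation between metric scores and human judgments. As a minimal sketch of how such a figure is computed, the snippet below implements the standard Pearson coefficient in plain Python. The score lists are hypothetical values made up for illustration; they are not data from the paper, and the function name `pearson` is an assumption rather than part of the PT-M2 repository.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, all left unnormalized (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical system-level scores, one entry per GEC system (illustration only).
metric_scores = [0.41, 0.35, 0.52, 0.47, 0.30]
human_scores = [0.20, 0.10, 0.35, 0.25, 0.05]

r = pearson(metric_scores, human_scores)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the metric ranks and spaces systems much as human judges do; a value near 0 means the metric carries little information about human preference.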

Abstract

Pretraining-based (PT-based) automatic evaluation metrics (e.g., BERTScore and BARTScore) have been widely used in several sentence generation tasks (e.g., machine translation and text summarization) because they correlate better with human judgments than traditional overlap-based methods. Although PT-based methods have become the de facto standard for training grammatical error correction (GEC) systems, GEC evaluation still does not benefit from pretrained knowledge. This paper takes the first step towards understanding and improving GEC evaluation with pretraining. We first find that arbitrarily applying PT-based metrics to GEC evaluation yields unsatisfactory correlation results because of excessive attention to inessential parts of system outputs (e.g., unchanged parts). To alleviate this limitation, we propose a novel GEC evaluation metric to achieve the best of both worlds, namely…
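To make the point about unchanged parts concrete, the sketch below uses Python's standard `difflib` to separate the edited tokens of a system output from the tokens copied verbatim from the source. This is an illustration only and not the PT-M2 algorithm: the helper names `changed_spans` and `unchanged_fraction` and the example sentence are assumptions introduced here.

```python
import difflib


def changed_spans(source_tokens, hypothesis_tokens):
    """Return (tag, source_span, hypothesis_span) for every non-equal edit."""
    matcher = difflib.SequenceMatcher(
        a=source_tokens, b=hypothesis_tokens, autojunk=False
    )
    return [
        (tag, source_tokens[i1:i2], hypothesis_tokens[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def unchanged_fraction(source_tokens, hypothesis_tokens):
    """Share of hypothesis tokens copied unchanged from the source."""
    matcher = difflib.SequenceMatcher(
        a=source_tokens, b=hypothesis_tokens, autojunk=False
    )
    copied = sum(
        j2 - j1
        for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
        if tag == "equal"
    )
    return copied / len(hypothesis_tokens)


# Hypothetical example sentence (not taken from the paper or its data).
source = "She go to school yesterday .".split()
hypothesis = "She went to school yesterday .".split()

edits = changed_spans(source, hypothesis)
share = unchanged_fraction(source, hypothesis)
print(edits)
print(f"{share:.0%} of the output is unchanged")
```

In this toy case only one of six output tokens is a correction, so a metric that scores the whole sentence is dominated by text the system never touched.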

Code & Models

Repositories

pygongnlp/pt-m2
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification