Revisiting Meta-evaluation for Grammatical Error Correction

Masamune Kobayashi; Masato Mita; Mamoru Komachi

arXiv:2403.02674 · cs.CL · May 28, 2024 · 1 citation

TL;DR

This paper introduces SEEDA, a new dataset for GEC metric meta-evaluation, revealing that aligning evaluation granularity improves correlation and that traditional metrics struggle with neural system outputs.

Contribution

The paper proposes SEEDA, a novel dataset for GEC meta-evaluation, and demonstrates the importance of granularity alignment and the limitations of existing metrics.

Findings

01 Aligning the evaluation granularity of metrics with that of the human ratings improves metric correlation.

02 Traditional metrics underperform on neural system outputs.

03 Edit-based metrics may have been underestimated in earlier studies.

Abstract

Metrics are the foundation for automatic evaluation in grammatical error correction (GEC), and the evaluation of the metrics themselves (meta-evaluation) relies on their correlation with human judgments. However, conventional meta-evaluations in English GEC face several challenges, including biases caused by inconsistencies in evaluation granularity and an outdated setup using classical systems. These problems can lead to misinterpretation of metrics and potentially hinder the applicability of GEC techniques. To address these issues, this paper proposes SEEDA, a new dataset for GEC meta-evaluation. SEEDA consists of corrections with human ratings along two different granularities, edit-based and sentence-based, covering 12 state-of-the-art systems including large language models (LLMs), and two human corrections with different focuses. The results of improved correlations by aligning the…
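As a rough illustration of what "correlation with human judgments" means in practice, the sketch below computes Pearson and Spearman correlations between per-system human scores and per-system metric scores. It is a minimal sketch, not the paper's procedure: the function names (`pearson`, `ranks`, `spearman`) and the toy score lists are assumptions made for illustration and are not taken from SEEDA.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks in ascending order; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy values, invented for illustration only (one entry per hypothetical system).
human_scores = [0.9, 0.4, 0.1, -0.3]
metric_scores = [0.62, 0.55, 0.58, 0.40]

print("Pearson: ", pearson(human_scores, metric_scores))
print("Spearman:", spearman(human_scores, metric_scores))
```

A metric whose ranking of systems matches the human ranking gets a Spearman correlation close to 1. Under the granularity argument above, the human scores fed into such a comparison would come from ratings collected at the same granularity (edit-based or sentence-based) as the metric being assessed.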

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Technology and Assessment · Natural Language Processing Techniques