A Simple Recipe for Multilingual Grammatical Error Correction

Sascha Rothe; Jonathan Mallinson; Eric Malmi; Sebastian Krause; Aliaksei Severyn

arXiv:2106.03830·cs.CL·August 10, 2022

TL;DR

This paper introduces a straightforward approach for training multilingual GEC models using synthetic data generation and large-scale language models, achieving state-of-the-art results across four languages and simplifying the training process.

Contribution

The paper proposes a language-agnostic method for generating synthetic training examples and shows that large multilingual models trained on this data, then fine-tuned on language-specific supervised sets, improve GEC performance.
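To make the idea of language-agnostic synthetic data concrete, here is a minimal sketch that turns clean sentences into (noisy source, clean target) pairs. The specific corruption operations (dropping a token, swapping adjacent tokens, dropping a character), the probability `p`, and the names `corrupt` and `make_pairs` are all assumptions chosen for illustration; this summary does not describe the paper's actual operations, so this should not be read as the authors' method.

```python
import random


def corrupt(sentence: str, rng: random.Random, p: float = 0.15) -> str:
    """Return a noisy copy of `sentence`.

    Hypothetical corruption scheme (not taken from the paper): each token is,
    with total probability `p`, dropped, swapped with its right neighbour,
    or stripped of one character. None of the operations needs
    language-specific resources, only whitespace tokenization.
    """
    tokens = sentence.split()
    out = []
    i = 0
    while i < len(tokens):
        r = rng.random()
        if r < p / 3:
            # Drop the current token entirely.
            i += 1
            continue
        if r < 2 * p / 3 and i + 1 < len(tokens):
            # Swap the current token with the next one.
            out.extend([tokens[i + 1], tokens[i]])
            i += 2
            continue
        if r < p and len(tokens[i]) > 1:
            # Drop one character from the current token.
            j = rng.randrange(len(tokens[i]))
            out.append(tokens[i][:j] + tokens[i][j + 1:])
            i += 1
            continue
        # Otherwise keep the token unchanged.
        out.append(tokens[i])
        i += 1
    return " ".join(out)


def make_pairs(sentences, seed: int = 0, p: float = 0.15):
    """Build (corrupted source, clean target) training pairs."""
    rng = random.Random(seed)
    return [(corrupt(s, rng, p), s) for s in sentences]


if __name__ == "__main__":
    for src, tgt in make_pairs(["She has been living in Prague for two years."], p=0.5):
        print(src, "->", tgt)
```

Because the target side is always the untouched sentence, the same function can be run over monolingual text in any language that is roughly whitespace-tokenizable, which is the property the "language-agnostic" claim relies on.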

Findings

01. Achieved new state-of-the-art results on English, Czech, German, and Russian GEC benchmarks.

02. Created the cLang-8 dataset by cleaning the lang-8 targets with the gT5 model.

03. Single-step fine-tuning on cLang-8 surpasses previous methods.

Abstract

This paper presents a simple recipe to train state-of-the-art multilingual Grammatical Error Correction (GEC) models. We achieve this by first proposing a language-agnostic method to generate a large number of synthetic examples. The second ingredient is to use large-scale multilingual language models (up to 11B parameters). Once fine-tuned on language-specific supervised sets we surpass the previous state-of-the-art results on GEC benchmarks in four languages: English, Czech, German and Russian. Having established a new set of baselines for GEC, we make our results easily reproducible and accessible by releasing a cLang-8 dataset. It is produced by using our best model, which we call gT5, to clean the targets of a widely used yet noisy lang-8 dataset. cLang-8 greatly simplifies typical GEC training pipelines composed of multiple fine-tuning stages -- we demonstrate that performing a…
