LM-Critic: Language Models for Unsupervised Grammatical Error Correction
Michihiro Yasunaga, Jure Leskovec, Percy Liang

TL;DR
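This paper introduces LM-Critic, which uses a pretrained language model to judge whether a sentence is grammatical, and combines it with the Break-It-Fix-It (BIFI) framework to generate training pairs for grammatical error correction from unlabeled sentences. The resulting corrector outperforms existing methods in both the unsupervised and supervised settings.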
Contribution
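The paper defines an LM-Critic, which judges a sentence to be grammatical if a pretrained language model assigns it a higher probability than its local perturbations. Used together with BIFI and a large set of unlabeled sentences, the critic bootstraps realistic ungrammatical / grammatical pairs for training a corrector, reducing reliance on manual annotation.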
Findings
Outperforms existing methods in the unsupervised setting (+7.7 F0.5)
Outperforms existing methods in the supervised setting (+0.5 F0.5)
Effective across multiple GEC datasets and domains
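The gains above are reported in F0.5, the F-beta score with beta = 0.5, which weights precision more heavily than recall. The sketch below shows the standard formula. The function name `f_beta` and the sample precision/recall values are illustrative assumptions, not figures from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favors precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative (made-up) values: the same pair of numbers scores higher
# when the larger one is precision, because beta = 0.5 favors precision.
high_precision = f_beta(precision=0.8, recall=0.4)
high_recall = f_beta(precision=0.4, recall=0.8)
```

With beta = 0.5, a corrector that makes fewer but more accurate edits is rewarded over one that makes many edits with more false positives.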
Abstract
Training a model for grammatical error correction (GEC) requires a set of labeled ungrammatical / grammatical sentence pairs, but manually annotating such pairs can be expensive. Recently, the Break-It-Fix-It (BIFI) framework has demonstrated strong results on learning to repair a broken program without any labeled examples, but this relies on a perfect critic (e.g., a compiler) that returns whether an example is valid or not, which does not exist for the GEC task. In this work, we show how to leverage a pretrained language model (LM) in defining an LM-Critic, which judges a sentence to be grammatical if the LM assigns it a higher probability than its local perturbations. We apply this LM-Critic and BIFI along with a large set of unlabeled sentences to bootstrap realistic ungrammatical / grammatical pairs for training a corrector. We evaluate our approach on GEC datasets across multiple…
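The critic criterion in the abstract (a sentence is grammatical if the LM gives it a higher probability than its local perturbations) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_log_prob` is a hypothetical stand-in for a pretrained LM's sentence log-probability, and `toy_perturb` (single-word deletions and adjacent-word swaps) is an assumed perturbation function chosen for brevity.

```python
from typing import Callable, List

# Hypothetical toy "language model": bigrams it considers fluent.
GOOD_BIGRAMS = {
    ("<s>", "she"), ("she", "likes"), ("likes", "cats"), ("cats", "</s>"),
}


def toy_log_prob(sentence: str) -> float:
    """Stand-in for an LM score: -1 for every bigram not in GOOD_BIGRAMS."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(0.0 if bg in GOOD_BIGRAMS else -1.0
               for bg in zip(tokens, tokens[1:]))


def toy_perturb(sentence: str) -> List[str]:
    """Assumed local perturbations: delete one word or swap two neighbors."""
    words = sentence.split()
    out = []
    for i in range(len(words)):
        out.append(" ".join(words[:i] + words[i + 1:]))  # deletion
    for i in range(len(words) - 1):
        swapped = words[:]
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        out.append(" ".join(swapped))  # adjacent swap
    return out


def lm_critic(sentence: str,
              log_prob: Callable[[str], float],
              perturb: Callable[[str], List[str]]) -> bool:
    """Judge `sentence` grammatical if it outscores all its perturbations."""
    base = log_prob(sentence)
    return all(base > log_prob(p) for p in perturb(sentence) if p != sentence)


good = lm_critic("she likes cats", toy_log_prob, toy_perturb)
bad = lm_critic("she cats likes", toy_log_prob, toy_perturb)
```

In this toy setup the well-ordered sentence is a local optimum of the scorer and is accepted, while the scrambled one is rejected because swapping two words back yields a higher score. A real critic would replace `toy_log_prob` with log-probabilities from a pretrained LM.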
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
Methods: Repair
