Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors

Ryo Nagata, Manabu Kimura, and Kazuaki Hanawa

arXiv:2108.12216 · cs.CL · August 30, 2021

Open Access

TL;DR

This study shows that a large-scale masked language model such as BERT can learn grammatical error detection from a small fraction of the training data (5 to 10%) while generalizing well, and that it can provide educational feedback.

Contribution

The paper shows that a BERT-based method needs only 5 to 10% of the training data to match the performance of a non-language model-based method trained on the full data, and that it can turn its knowledge of grammar into error detection rules by fine-tuning on a few samples.

Findings

01

A BERT-based error detection method needs only 5 to 10% of the training data to match the performance a non-language model-based method achieves with the full training data.

02

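As training data increases, recall improves much faster in the BERT-based method than in the non-language model-based method, while precision behaves similarly in both.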

03

Experiments with pseudo error data further show that the BERT-based method exhibits these properties.

Abstract

In this paper, we explore the capacity of a language model-based method for grammatical error detection in detail. We first show that 5 to 10% of training data are enough for a BERT-based error detection method to achieve performance equivalent to what a non-language model-based method can achieve with the full training data; recall improves much faster with respect to training data size in the BERT-based method than in the non-language model method, while precision behaves similarly. These results suggest that (i) the BERT-based method should have a good knowledge of grammar required to recognize certain types of error and that (ii) it can transform the knowledge into error detection rules by fine-tuning with a few training samples, which explains its high generalization ability in grammatical error detection. We further show with pseudo error data that it actually exhibits such nice properties in…
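The comparison described in the abstract rests on two quantities: precision and recall of the detector, measured at several training-set sizes. The sketch below shows one way these could be computed. It assumes token-level binary labels (1 = error, 0 = correct), which the abstract does not specify, and the function names `precision_recall` and `subsample_sizes` are hypothetical. The labels are toy values made up for illustration, not data from the paper.

```python
from typing import List, Tuple


def precision_recall(gold: List[int], pred: List[int]) -> Tuple[float, float]:
    """Token-level precision and recall for binary error labels (1 = error)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    # Guard against division by zero when nothing is predicted or nothing is gold.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def subsample_sizes(n_train: int, fractions: List[float]) -> List[int]:
    """Number of training sentences to keep for each fraction of the full set."""
    return [max(1, round(n_train * f)) for f in fractions]


# Toy labels, made up for illustration only (assumption, not from the paper).
gold = [0, 1, 0, 0, 1, 1, 0, 0]
pred = [0, 1, 0, 1, 0, 1, 0, 0]
p, r = precision_recall(gold, pred)

# Training-set sizes for a hypothetical corpus of 1000 sentences.
sizes = subsample_sizes(1000, [0.05, 0.10, 1.0])
```

Evaluating `precision_recall` once per entry in `sizes`, for each detector, would give the kind of learning curves the abstract compares.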

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification