Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection

Sudhanshu Kasewa; Pontus Stenetorp; Sebastian Riedel

arXiv:1810.00668·cs.CL·October 2, 2018

TL;DR

This paper presents a method for generating realistic grammatical errors using sequence-to-sequence models to create synthetic training data, significantly improving grammatical error detection performance.

Contribution

It introduces a cost-effective approach to generate high-quality synthetic errors, enhancing grammatical error detection models beyond previous state-of-the-art results.

Findings

1. Synthetic error data improves detection accuracy
2. Achieves over 5% $F_{0.5}$ score improvement
3. Generated errors are mostly human-like
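
The $F_{0.5}$ score cited in the findings is the standard F-beta measure with beta = 0.5, which weights precision more heavily than recall. A minimal sketch of that formula follows. The function name `f_beta` and the precision/recall values are illustrative assumptions, not taken from the paper or its repository.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: a weighted harmonic mean of precision and recall.

    beta < 1 favours precision; beta = 0.5 gives the F_0.5 measure.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is detected correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical inputs, chosen only to show the precision bias of F_0.5.
high_precision = f_beta(precision=0.8, recall=0.4)
high_recall = f_beta(precision=0.4, recall=0.8)
print(f"P=0.8, R=0.4 -> F0.5 = {high_precision:.3f}")
print(f"P=0.4, R=0.8 -> F0.5 = {high_recall:.3f}")
```

Swapping precision and recall changes the score, which is why a detector that flags fewer but more reliable errors can rank higher under $F_{0.5}$ than one that flags more errors less reliably.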

Abstract

Grammatical error correction, like other machine learning tasks, greatly benefits from large quantities of high quality training data, which is typically expensive to produce. While writing a program to automatically generate realistic grammatical errors would be difficult, one could learn the distribution of naturally occurring errors and attempt to introduce them into other datasets. Initial work on inducing errors in this way using statistical machine translation has shown promise; we investigate cheaply constructing synthetic samples, given a small corpus of human-annotated data, using an off-the-rack attentive sequence-to-sequence model and a straight-forward post-processing procedure. Our approach yields error-filled artificial data that helps a vanilla bi-directional LSTM to outperform the previous state of the art at grammatical error detection, and a previously introduced model…

Code & Models

Repositories

skasewa/wronging (tf, official)
