Multi-head Sequence Tagging Model for Grammatical Error Correction

Kamal Al-Sabahi; Kang Yang; Wangwang Liu; Guanyu Jiang; Xian Li; Ming Yang

arXiv:2410.16473·cs.CL·October 23, 2024

TL;DR

This paper introduces a multi-head, multi-task sequence tagging model for grammatical error correction that leverages synthetic data and novel character transformations, significantly outperforming previous methods.

Contribution

It proposes a novel multi-head, multi-task learning approach with synthetic data generation and character transformations for improved GEC performance.

Findings

1. Achieves state-of-the-art F0.5 scores on the BEA-19 and CoNLL-14 datasets.

2. Outperforms recent GEC models by a significant margin.

3. Effective use of synthetic data and multi-task learning improves correction accuracy.

Abstract

Solving the Grammatical Error Correction (GEC) problem requires a mapping between a source sequence and a target sequence that differ in only a few spans. For this reason, attention has shifted to non-autoregressive, or sequence tagging, models, in which GEC is simplified from Seq2Seq generation to labeling the input tokens with edit commands chosen from a large edit space. Because of this large number of classes and the limitations of the available datasets, current sequence tagging approaches still struggle to handle a broad range of grammatical errors when they focus on a single task. To this end, we simplify GEC further by dividing it into seven related subtasks: Insertion, Deletion, Merge, Substitution, Transformation, Detection, and Correction, with Correction being our primary focus. A distinct classification head is dedicated to each…
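The tagging formulation in the abstract can be made concrete with a small sketch: each input token receives one edit command, and applying the commands in order yields the corrected sentence. The tag names below (`$KEEP`, `$DELETE`, `$APPEND_x`, `$REPLACE_x`, `$MERGE`, `$TRANSFORM_CAPITALIZE`) and the `subtask_labels` mapping are illustrative assumptions, not the paper's actual tag vocabulary or head definitions, which the truncated abstract does not specify.

```python
from typing import Dict, List

# Hypothetical edit commands, one per input token (not the paper's vocabulary).
KEEP = "$KEEP"
DELETE = "$DELETE"
MERGE = "$MERGE"                      # glue this token to the next kept token
CAPITALIZE = "$TRANSFORM_CAPITALIZE"  # one example of a character transformation
APPEND_PREFIX = "$APPEND_"            # insert a word after this token
REPLACE_PREFIX = "$REPLACE_"          # substitute this token with another word


def apply_edit_tags(tokens: List[str], tags: List[str]) -> List[str]:
    """Apply one edit command per token and return the corrected tokens."""
    if len(tokens) != len(tags):
        raise ValueError("need exactly one tag per token")
    out: List[str] = []
    merge_pending = False  # True when the previous kept token carried $MERGE
    for token, tag in zip(tokens, tags):
        if tag == DELETE:
            continue
        appended = None
        if tag.startswith(REPLACE_PREFIX):
            token = tag[len(REPLACE_PREFIX):]
        elif tag.startswith(APPEND_PREFIX):
            appended = tag[len(APPEND_PREFIX):]
        elif tag == CAPITALIZE:
            token = token.capitalize()
        elif tag not in (KEEP, MERGE):
            raise ValueError(f"unknown tag: {tag}")
        if merge_pending and out:
            out[-1] += token          # complete the merge started earlier
        else:
            out.append(token)
        merge_pending = tag == MERGE
        if appended is not None:
            out.append(appended)
    return out


def subtask_labels(tag: str) -> Dict[str, object]:
    """Assumed mapping from one edit tag to per-subtask targets.

    Each simpler subtask gets a binary label; the Correction target is the
    full edit tag. This only illustrates how seven heads could share one
    annotation; it is not taken from the paper.
    """
    return {
        "insertion": tag.startswith(APPEND_PREFIX),
        "deletion": tag == DELETE,
        "merge": tag == MERGE,
        "substitution": tag.startswith(REPLACE_PREFIX),
        "transformation": tag == CAPITALIZE,
        "detection": tag != KEEP,
        "correction": tag,
    }


source = ["she", "go", "to", "to", "school", "every", "day"]
edits = [CAPITALIZE, "$REPLACE_goes", KEEP, DELETE, KEEP, KEEP, "$APPEND_."]
corrected = apply_edit_tags(source, edits)
print(" ".join(corrected))
```

Because every token is tagged independently, all positions can be predicted in parallel, which is what distinguishes this setup from autoregressive Seq2Seq decoding. In a multi-head model, the simpler binary targets would serve as auxiliary signals for the much larger Correction label space.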

Taxonomy

Methods: Softmax · Attention Is All You Need · Sigmoid Activation · Tanh Activation · Denoising Autoencoder · Long Short-Term Memory · Sequence to Sequence