RedPenNet for Grammatical Error Correction: Outputs to Tokens, Attentions to Spans

Bohdan Didenko (1), Andrii Sameliuk (1) ((1) WebSpellChecker LLC, Ukraine)

arXiv:2309.10898 · cs.CL · September 21, 2023

TL;DR

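RedPenNet is an architecture for grammatical error correction (GEC) that balances fully autoregressive sequence-to-sequence generation with sequence tagging while reducing architectural and parametric redundancy. The authors report state-of-the-art results, excluding system combinations, on the BEA-2019 test set and the UAGEC+Fluency benchmark.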

Contribution

The paper introduces RedPenNet, a text editing model for GEC that combines the strengths of autoregressive and sequence tagging methods while reducing the architectural and parametric redundancies of earlier approaches.

Findings

01 Achieved an $F_{0.5}$ score of 77.60 on the BEA-2019 test set.

02 Achieved an $F_{0.5}$ score of 67.71 on the UAGEC+Fluency benchmark.

03 The authors present both results as state-of-the-art for GEC, excluding system combinations.
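
For readers unfamiliar with the metric, the $F_{0.5}$ scores above are the standard F-beta measure with beta = 0.5, which weights precision more heavily than recall. The sketch below shows only the arithmetic. The function name `f_beta` and the precision and recall values are hypothetical illustrations; they are not taken from the paper, and the benchmarks' official scorers also handle edit extraction and alignment, which this does not cover.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 weights precision above recall."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0.0:
        # Conventionally 0 when both precision and recall are 0.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


if __name__ == "__main__":
    # Hypothetical values for illustration only (not reported in the paper).
    precision, recall = 0.80, 0.50
    print(f"F0.5 = {f_beta(precision, recall):.4f}")
    print(f"F1   = {f_beta(precision, recall, beta=1.0):.4f}")
```

With these illustrative inputs, $F_{0.5}$ comes out higher than $F_1$ because precision exceeds recall, which is why GEC evaluation favors systems that avoid spurious corrections.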

Abstract

Text editing tasks, including sentence fusion, sentence splitting and rephrasing, text simplification, and Grammatical Error Correction (GEC), share a common trait: their input and output sequences are highly similar. This area of research lies at the intersection of two well-established fields: (i) fully autoregressive sequence-to-sequence approaches commonly used in tasks like Neural Machine Translation (NMT) and (ii) sequence tagging techniques commonly used for tasks such as part-of-speech tagging and named-entity recognition (NER). In the pursuit of a balanced architecture, researchers have come up with numerous imaginative and unconventional solutions, which we discuss in the Related Works section. Our approach to text editing tasks is called RedPenNet and is aimed at reducing architectural and parametric redundancies present in…

Code & Models

Repositories

webspellchecker/unlp-2023-shared-task (official)

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling