An Extended Sequence Tagging Vocabulary for Grammatical Error Correction
Stuart Mesham, Christopher Bryant, Marek Rei, Zheng Yuan

TL;DR
This paper introduces an extended sequence tagging vocabulary for grammatical error correction, incorporating specialised tags for spelling and morphology, leading to improved performance and generalisation in error correction tasks.
Contribution
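The paper presents an extended tagset for sequence-tagging GEC that adds specialised tags for spelling correction and morphological inflection. The new tagset improves correction performance, and ensembles trained with it outperform ensembles trained with the baseline tagset on the public BEA benchmark.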
Findings
Improved overall GEC performance with the new tagset.
Enhanced correction of spelling and morphological errors.
Ensembles trained with the new tagset outperform those trained with the baseline tagset on the public BEA benchmark.
Abstract
We extend a current sequence-tagging approach to Grammatical Error Correction (GEC) by introducing specialised tags for spelling correction and morphological inflection using the SymSpell and LemmInflect algorithms. Our approach improves generalisation: the proposed new tagset allows a smaller number of tags to correct a larger range of errors. Our results show a performance improvement both overall and in the targeted error categories. We further show that ensembles trained with our new tagset outperform those trained with the baseline tagset on the public BEA benchmark.
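To make the generalisation claim concrete, here is a minimal sketch of how a single generic tag can cover corrections that would otherwise each need a word-specific replacement tag. It is not the authors' implementation. The tag names ($KEEP, $DELETE, $SPELL, $INFLECT_<POS>) are illustrative assumptions, the toy vocabulary with a brute-force edit-distance lookup stands in for SymSpell, and the small lookup tables stand in for LemmInflect.

```python
from typing import Dict, List, Tuple

# Toy vocabulary standing in for a SymSpell dictionary (assumption).
VOCAB: List[str] = ["received", "their", "went", "go", "walk", "walked"]

# Toy tables standing in for LemmInflect (assumption): token -> lemma,
# and (lemma, Penn Treebank POS tag) -> inflected form.
LEMMAS: Dict[str, str] = {"goes": "go", "went": "go", "walked": "walk"}
INFLECTIONS: Dict[Tuple[str, str], str] = {
    ("go", "VBD"): "went",
    ("go", "VBZ"): "goes",
    ("walk", "VBD"): "walked",
}


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def spell(token: str) -> str:
    """Return the closest vocabulary word (stand-in for a SymSpell lookup)."""
    return min(VOCAB, key=lambda w: edit_distance(token, w))


def inflect(token: str, pos: str) -> str:
    """Re-inflect a token to the given POS tag (stand-in for LemmInflect)."""
    lemma = LEMMAS.get(token, token)
    return INFLECTIONS.get((lemma, pos), token)


def apply_tags(tokens: List[str], tags: List[str]) -> List[str]:
    """Apply one edit tag per source token and return the corrected tokens."""
    out: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "$KEEP":
            out.append(token)
        elif tag == "$DELETE":
            continue
        elif tag == "$SPELL":
            out.append(spell(token))
        elif tag.startswith("$INFLECT_"):
            out.append(inflect(token, tag[len("$INFLECT_"):]))
        else:
            raise ValueError(f"unknown tag: {tag}")
    return out


tokens = ["I", "go", "home", "and", "recieved", "mail"]
tags = ["$KEEP", "$INFLECT_VBD", "$KEEP", "$KEEP", "$SPELL", "$KEEP"]
corrected = apply_tags(tokens, tags)
```

In this sketch the same $INFLECT_VBD tag turns "go" into "went" and "walk" into "walked", and $SPELL handles any misspelling the lookup can resolve. A tagset built from word-specific replacement tags would need a separate tag for each target word.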
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling
