Gender-Inclusive Grammatical Error Correction through Augmentation
Gunnar Lund, Kostiantyn Omelianchuk, Igor Samokhin

TL;DR
This paper identifies gender bias in grammatical error correction systems and introduces a novel data augmentation method to reduce bias, especially for singular 'they', without sacrificing system quality.
Contribution
The paper presents a new augmentation technique for singular 'they' and refines an existing augmentation method for masculine and feminine terms to mitigate gender bias in GEC systems.
Findings
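Gender bias in three competitive GEC systems can be quantified using parallel datasets of texts with masculine terms, feminine terms, and singular 'they'.
Training data generated by the augmentation techniques reduces gender bias, especially for singular 'they'.
System quality stays at the same level after bias mitigation.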
Abstract
In this paper, we show that GEC systems display gender bias related to the use of masculine and feminine terms and the gender-neutral singular "they". We develop parallel datasets of texts with masculine and feminine terms and singular "they" and use them to quantify gender bias in three competitive GEC systems. We contribute a novel data augmentation technique for singular "they" leveraging linguistic insights about its distribution relative to plural "they". We demonstrate that both this data augmentation technique and a refinement of a similar augmentation technique for masculine and feminine terms can generate training data that reduces bias in GEC systems, especially with respect to singular "they", while maintaining the same level of quality.
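The idea of parallel datasets can be illustrated with a minimal sketch. This is not the authors' pipeline: the pronoun table, the function names (`swap_subject_pronoun`, `parallel_variants`, `accuracy_by_variant`), and the exact-match metric are all assumptions made for illustration. The sketch rewrites only the subject pronouns "he" and "she", so it ignores object and possessive forms, contractions, and the verb agreement changes that singular "they" requires in the present tense.

```python
import re

# Hypothetical table of subject pronouns for each variant (an assumption,
# not taken from the paper). Object and possessive forms are not handled.
SUBJECT_PRONOUNS = {"masculine": "he", "feminine": "she", "neutral": "they"}

# Matches "he" or "she" as whole words, in any capitalization.
_PRONOUN = re.compile(r"\b(he|she)\b", re.IGNORECASE)


def swap_subject_pronoun(sentence, target):
    """Replace every subject pronoun he/she with the target variant's pronoun."""
    replacement = SUBJECT_PRONOUNS[target]

    def _sub(match):
        word = match.group(0)
        # Preserve sentence-initial capitalization.
        return replacement.capitalize() if word[0].isupper() else replacement

    return _PRONOUN.sub(_sub, sentence)


def parallel_variants(sentence):
    """Return one version of the sentence per variant in SUBJECT_PRONOUNS."""
    return {t: swap_subject_pronoun(sentence, t) for t in SUBJECT_PRONOUNS}


def accuracy_by_variant(system, examples):
    """Exact-match accuracy of a GEC callable on each parallel variant.

    `system` maps a source string to a corrected string; `examples` is a
    list of (source, reference) pairs. Exact match is a stand-in metric
    chosen for simplicity, not the metric used in the paper.
    """
    results = {}
    for target in SUBJECT_PRONOUNS:
        hits = 0
        for source, reference in examples:
            src = swap_subject_pronoun(source, target)
            ref = swap_subject_pronoun(reference, target)
            hits += system(src) == ref
        results[target] = hits / len(examples)
    return results
```

Under these assumptions, a gap between the "neutral" score and the "masculine" or "feminine" scores on otherwise identical sentences would be one simple signal of the kind of bias the abstract describes.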
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling
