Gender Bias and Universal Substitution Adversarial Attacks on Grammatical Error Correction Systems for Automated Assessment
Vyas Raina, Mark Gales

TL;DR
This paper investigates how gender bias and universal substitution adversarial attacks can deceive grammatical error correction systems, which are used for automated language assessment, by making small input changes that hide errors and falsely improve fluency scores.
Contribution
It introduces a realistic adversarial attack method targeting GEC systems used in automated assessment, highlighting vulnerabilities related to gender bias and attack effectiveness.
Findings
Adversarial attacks can significantly reduce detected errors in GEC outputs.
Gender bias influences the success of adversarial manipulations.
The proposed attack demonstrates potential to deceive GEC-based assessments.
Abstract
Grammatical Error Correction (GEC) systems perform a sequence-to-sequence task: an input word sequence containing grammatical errors is corrected by the GEC system, which outputs a grammatically correct word sequence. With the advent of deep learning methods, automated GEC systems have become increasingly popular. For example, GEC systems are often used on speech transcriptions of English learners as a form of assessment and feedback; these powerful GEC systems can be used to automatically measure an aspect of a candidate's fluency. The count of edits from a candidate's input sentence (or essay) to a GEC system's grammatically corrected output sentence is indicative of a candidate's language ability, where fewer edits suggest better fluency. The count of edits can thus be viewed as a fluency score, with zero implying perfect fluency. However,…
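The edit-count fluency score described in the abstract can be illustrated with a minimal sketch. The excerpt does not say how edits are counted, so this example assumes a word-level Levenshtein distance (insertions, deletions, and substitutions each cost one edit) between the candidate's input and the GEC output. The function name `word_edit_count` and the example sentences are hypothetical, and the "corrected" strings are written by hand rather than produced by a real GEC system.

```python
def word_edit_count(source: str, corrected: str) -> int:
    """Word-level Levenshtein distance between an input sentence and its
    corrected form. ASSUMPTION: this stands in for the paper's edit count;
    the excerpt does not specify the exact edit metric."""
    src = source.split()
    tgt = corrected.split()

    # prev[j] = edits needed to turn the first i-1 source words
    # into the first j target words.
    prev = list(range(len(tgt) + 1))
    for i, s_word in enumerate(src, start=1):
        curr = [i]  # deleting all i source words gives an empty target
        for j, t_word in enumerate(tgt, start=1):
            substitution_cost = 0 if s_word == t_word else 1
            curr.append(min(
                prev[j] + 1,                      # delete s_word
                curr[j - 1] + 1,                  # insert t_word
                prev[j - 1] + substitution_cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


# Hand-written illustration (not real GEC output):
candidate = "he go to school yesterday"
gec_output = "he went to school yesterday"
fluency_score = word_edit_count(candidate, gec_output)  # 1 edit: go -> went
```

Under this assumption, a score of zero means the GEC system left the input unchanged. That is the outcome an attack aims for when it hides errors from the system.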
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection
