End-to-End Spoken Grammatical Error Correction
Mengjie Qian, Rao Ma, Stefano Bannò, Mark J.F. Gales, Kate M. Knill

TL;DR
This paper explores an end-to-end framework for spoken grammatical error correction (SGEC) using the Whisper model, addressing challenges like data scarcity and error propagation, and demonstrating significant performance improvements.
Contribution
It introduces a novel end-to-end SGEC system with pseudo-labeling, contextual information, and a reference alignment process to enhance feedback accuracy.
Findings
Significant performance boost on LNG and S&I corpora.
Effective pseudo-labeling increases training data from 77 to over 2500 hours.
Novel reference alignment improves feedback precision.
Abstract
Grammatical Error Correction (GEC) and feedback play a vital role in supporting second language (L2) learners, educators, and examiners. While written GEC is well-established, spoken GEC (SGEC), aiming to provide feedback based on learners' speech, poses additional challenges due to disfluencies, transcription errors, and the lack of structured input. SGEC systems typically follow a cascaded pipeline consisting of Automatic Speech Recognition (ASR), disfluency detection, and GEC, making them vulnerable to error propagation across modules. This work examines an End-to-End (E2E) framework for SGEC and feedback generation, highlighting challenges and possible solutions when developing these systems. Cascaded, partial-cascaded and E2E architectures are compared, all built on the Whisper foundation model. A challenge for E2E systems is the scarcity of GEC labeled spoken data. To address…
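The cascaded pipeline described in the abstract (ASR, then disfluency detection, then GEC) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in and not the authors' system: `toy_asr`, `noisy_asr`, `toy_disfluency_removal`, `toy_gec`, and `run_cascade` are illustrative names, the "audio" is represented as a plain string, and each stage is a trivial rule rather than a Whisper-based model. The point is only to show how a mistake in an early module passes unchanged through later ones.

```python
from typing import Callable, List

# A stage maps text to text. In a real system the first stage would consume
# audio; here a string stands in for the spoken input (assumption).
Stage = Callable[[str], str]

FILLERS = {"um", "uh", "er"}


def toy_asr(audio: str) -> str:
    """Stand-in for speech recognition: just lowercases the 'audio' string."""
    return audio.lower()


def noisy_asr(audio: str) -> str:
    """Same as toy_asr, but with one simulated recognition error."""
    return toy_asr(audio).replace(" go ", " got ")


def toy_disfluency_removal(transcript: str) -> str:
    """Drop filler words and immediate word repetitions."""
    fluent: List[str] = []
    for word in transcript.split():
        if word in FILLERS:
            continue
        if fluent and fluent[-1] == word:
            continue
        fluent.append(word)
    return " ".join(fluent)


def toy_gec(fluent: str) -> str:
    """Stand-in for grammatical error correction: one hard-coded rule."""
    return fluent.replace("he go ", "he goes ")


def run_cascade(audio: str, stages: List[Stage]) -> str:
    """Feed the output of each stage into the next one."""
    text = audio
    for stage in stages:
        text = stage(text)
    return text


clean = run_cascade("Um he he go to school",
                    [toy_asr, toy_disfluency_removal, toy_gec])
propagated = run_cascade("Um he he go to school",
                         [noisy_asr, toy_disfluency_removal, toy_gec])
print(clean)
print(propagated)
```

With the error-free recognizer the grammar rule fires; with the simulated recognition error the GEC stage never sees the pattern it could correct, so the mistake survives to the output. An end-to-end system, by contrast, would be a single model mapping audio directly to corrected text, with no intermediate transcript for errors to accumulate in.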
