Exploiting N-Best Hypotheses to Improve an SMT Approach to Grammatical Error Correction
Duc Tam Hoang, Shamil Chollampatt, Hwee Tou Ng

TL;DR
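This paper improves grammatical error correction (GEC) by exploiting the n-best hypotheses produced by a statistical machine translation (SMT) system. A classifier scores the edits in those hypotheses, and the scores are used either to select edits or to re-rank the hypotheses, leading to statistically significant accuracy improvements on benchmark datasets.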
Contribution
The paper introduces a novel method that exploits multiple hypotheses from an SMT system for GEC, improving accuracy with a classifier that scores edits and is used for edit selection or hypothesis re-ranking.
Findings
Achieved statistically significant accuracy improvements
Outperformed previous state-of-the-art GEC systems
Effective use of n-best hypotheses for error correction
Abstract
Grammatical error correction (GEC) is the task of detecting and correcting grammatical errors in texts written by second language learners. The statistical machine translation (SMT) approach to GEC, in which sentences written by second language learners are translated to grammatically correct sentences, has achieved state-of-the-art accuracy. However, the SMT approach is unable to utilize global context. In this paper, we propose a novel approach to improve the accuracy of GEC, by exploiting the n-best hypotheses generated by an SMT approach. Specifically, we build a classifier to score the edits in the n-best hypotheses. The classifier can be used to select appropriate edits or re-rank the n-best hypotheses. We apply these methods to a state-of-the-art GEC system that uses the SMT approach. Our experiments show that our methods achieve statistically significant improvements in accuracy…
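The two uses of the edit classifier described in the abstract (selecting edits and re-ranking hypotheses) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: edits are extracted with a token-level difflib alignment, `toy_score` is a hypothetical stand-in for the trained classifier, a hypothesis is scored by the mean of its edit scores, and the selection threshold of 0.5 is arbitrary.

```python
from difflib import SequenceMatcher


def extract_edits(source, hypothesis):
    """Return token-level edits as (start, end, replacement) tuples.

    start/end index the source tokens; replacement is the hypothesis text.
    Using a difflib alignment is an assumption of this sketch.
    """
    src, hyp = source.split(), hypothesis.split()
    matcher = SequenceMatcher(a=src, b=hyp, autojunk=False)
    return [
        (i1, i2, " ".join(hyp[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def rerank(source, nbest, score_edit):
    """Sort hypotheses by the mean score of their edits, best first."""
    def hypothesis_score(hyp):
        edits = extract_edits(source, hyp)
        if not edits:
            return 0.0  # assumption: an unchanged sentence gets a neutral score
        return sum(score_edit(source, e) for e in edits) / len(edits)

    return sorted(nbest, key=hypothesis_score, reverse=True)


def select_edits(source, nbest, score_edit, threshold=0.5):
    """Pool edits from all hypotheses and apply the non-overlapping ones
    whose score reaches the threshold, highest-scoring first."""
    tokens = source.split()
    candidates = {e for hyp in nbest for e in extract_edits(source, hyp)}
    ranked = sorted(candidates, key=lambda e: (-score_edit(source, e), e))
    chosen, used = [], set()
    for start, end, repl in ranked:
        if score_edit(source, (start, end, repl)) < threshold:
            break
        # An insertion (start == end) is treated as occupying position start.
        span = set(range(start, max(end, start + 1)))
        if span & used:
            continue
        used |= span
        chosen.append((start, end, repl))
    # Apply right to left so earlier token indices stay valid.
    for start, end, repl in sorted(chosen, reverse=True):
        tokens[start:end] = repl.split()
    return " ".join(tokens)


# Hypothetical stand-in for the trained edit classifier.
TOY_SCORES = {"went": 0.9, "goes": 0.2, "the": 0.1}


def toy_score(source, edit):
    return TOY_SCORES.get(edit[2], 0.0)


source = "He go to school yesterday"
nbest = [
    "He goes to school yesterday",
    "He went to school yesterday",
    "He went to the school yesterday",
]
best_hypothesis = rerank(source, nbest, toy_score)[0]
corrected = select_edits(source, nbest, toy_score)
```

Re-ranking can only return one of the existing hypotheses, while edit selection can combine edits from different hypotheses into a sentence that appears nowhere in the n-best list. In a real system the scoring function would be a trained classifier over features of each edit; the lookup table above only keeps the example runnable.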
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
