Exploiting Unlabeled Data for Neural Grammatical Error Detection
Zhuoran Liu, Yang Liu

TL;DR
This paper presents an attention-based neural network for grammatical error detection that is trained on examples derived from unlabeled data. The attention mechanism captures long-distance dependencies that influence the word being checked, and the approach outperforms SVM and convolutional baselines.
Contribution
It casts grammatical error detection as binary classification and derives positive and negative training examples from unlabeled data, addressing the limited quantity and coverage of human-annotated corpora.
Findings
Significant performance improvement over SVMs and CNNs
Effective use of unlabeled data for error detection
Attention mechanism captures long-distance dependencies
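To make the last finding concrete, here is a minimal sketch of how attention can weight context words by relevance rather than by distance from the word being checked. It uses plain dot-product attention, which is an assumption chosen for illustration and not necessarily the scoring function used in the paper; the names `softmax` and `attend` and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(target_vec, context_vecs):
    """Dot-product attention of one target word over its context words.

    Assumption: relevance is the dot product between the target vector
    and each context vector. Position in the sentence plays no role, so
    a distant word can receive the largest weight.
    """
    scores = [
        sum(t * c for t, c in zip(target_vec, vec)) for vec in context_vecs
    ]
    weights = softmax(scores)
    dim = len(target_vec)
    # Weighted sum of the context vectors, one value per dimension.
    summary = [
        sum(w * vec[d] for w, vec in zip(weights, context_vecs))
        for d in range(dim)
    ]
    return weights, summary
```

In a full model the resulting summary vector would typically be combined with the target word's representation and passed to a binary classifier; that step is omitted here.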
Abstract
Identifying and correcting grammatical errors in the text written by non-native writers has received increasing attention in recent years. Although a number of annotated corpora have been established to facilitate data-driven grammatical error detection and correction approaches, they are still limited in terms of quantity and coverage because human annotation is labor-intensive, time-consuming, and expensive. In this work, we propose to utilize unlabeled data to train neural network based grammatical error detection models. The basic idea is to cast error detection as a binary classification problem and derive positive and negative training examples from unlabeled data. We introduce an attention-based neural network to capture long-distance dependencies that influence the word being detected. Experiments show that the proposed approach significantly outperforms SVMs and convolutional…
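The idea of deriving training examples from unlabeled text can be sketched as follows. The abstract does not say how the examples are constructed, so everything here is an assumption made for illustration: the sketch treats each original word as correct, substitutes a word from a small hypothetical confusion set to create an erroneous counterpart, and uses label 1 for "assumed correct" and 0 for "assumed erroneous". `CONFUSION_SETS`, `confusables`, and `derive_examples` are hypothetical names.

```python
import random

# Hypothetical confusion sets: groups of words that writers may mix up.
CONFUSION_SETS = [
    {"a", "an", "the"},
    {"in", "on", "at"},
    {"is", "are"},
]


def confusables(word):
    """Return the other members of the confusion set containing `word`."""
    for group in CONFUSION_SETS:
        if word in group:
            return sorted(group - {word})
    return []


def derive_examples(tokens, rng):
    """Build (tokens, target_index, label) triples from one unlabeled sentence.

    label 1: the original word is kept and assumed to be correct.
    label 0: the word is swapped for a confusable one and assumed erroneous.
    """
    examples = []
    for i, word in enumerate(tokens):
        alternatives = confusables(word)
        if not alternatives:
            continue  # no confusion set covers this word, so skip it
        examples.append((list(tokens), i, 1))
        corrupted = list(tokens)
        corrupted[i] = rng.choice(alternatives)
        examples.append((corrupted, i, 0))
    return examples
```

For example, `derive_examples("she is in the room".split(), random.Random(0))` produces one kept and one corrupted example for each of "is", "in", and "the". Assuming the original text is error-free is itself a simplification, since unlabeled text can contain mistakes.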
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
