Exploiting Unlabeled Data for Neural Grammatical Error Detection

Zhuoran Liu; Yang Liu

arXiv:1611.08987 · cs.CL · November 30, 2016 · 2 cites

Open Access

TL;DR

This paper trains an attention-based neural network for grammatical error detection on examples derived from unlabeled data. The attention mechanism captures long-distance dependencies that influence the word being checked, and the model is reported to outperform SVM and CNN baselines.

Contribution

The paper casts grammatical error detection as binary classification and derives both positive and negative training examples from unlabeled data, which addresses the limited quantity and coverage of human-annotated corpora.

Findings

01 Significant performance improvement over SVMs and CNNs
02 Effective use of unlabeled data for error detection
03 Attention mechanism captures long-distance dependencies

Abstract

Identifying and correcting grammatical errors in the text written by non-native writers has received increasing attention in recent years. Although a number of annotated corpora have been established to facilitate data-driven grammatical error detection and correction approaches, they are still limited in terms of quantity and coverage because human annotation is labor-intensive, time-consuming, and expensive. In this work, we propose to utilize unlabeled data to train neural network based grammatical error detection models. The basic idea is to cast error detection as a binary classification problem and derive positive and negative training examples from unlabeled data. We introduce an attention-based neural network to capture long-distance dependencies that influence the word being detected. Experiments show that the proposed approach significantly outperforms SVMs and convolutional…
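The idea of deriving labeled examples from unlabeled text can be illustrated with a minimal sketch. The abstract does not state how the examples are constructed, so everything here is an assumption for illustration: the hypothetical `CONFUSION_SET` of prepositions, the `derive_examples` helper, and the choice to treat each original word as error-free and a randomly substituted word as an error. It is not the authors' procedure.

```python
import random

# Hypothetical confusion set (an assumption, not taken from the paper).
CONFUSION_SET = ["in", "on", "at", "for", "to", "of", "with"]


def derive_examples(sentence, confusion_set=CONFUSION_SET, rng=None):
    """Build (tokens, target_index, is_error) triples from one unlabeled sentence.

    Assumption: the sentence as written is correct, so every confusion-set
    word in it yields one error-free example; swapping that word for a
    different confusion-set word yields one erroneous example.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    tokens = sentence.split()
    examples = []
    for i, tok in enumerate(tokens):
        if tok.lower() not in confusion_set:
            continue
        # Original word in its context: labeled as not an error.
        examples.append((tokens, i, False))
        # Same context with a substituted word: labeled as an error.
        alternatives = [w for w in confusion_set if w != tok.lower()]
        corrupted = tokens[:i] + [rng.choice(alternatives)] + tokens[i + 1:]
        examples.append((corrupted, i, True))
    return examples


examples = derive_examples("She arrived at the station in the morning")
for tokens, index, is_error in examples:
    print(is_error, index, " ".join(tokens))
```

Each triple pairs a full sentence context with the position of the word being checked, which is the input shape a binary classifier over "the word being detected" would need. A substituted word can occasionally still be grammatical in context, so labels produced this way are noisy.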

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification