Neural Language Correction with Character-Based Attention

Ziang Xie; Anand Avati; Naveen Arivazhagan; Dan Jurafsky; Andrew Y. Ng

arXiv:1603.09727 · cs.CL · April 1, 2016 · 124 citations

TL;DR

This paper introduces a character-based neural encoder-decoder model with attention for language correction, effectively handling orthographic errors and improving performance on learner text datasets.

Contribution

The paper presents a novel character-level neural correction model with attention, outperforming previous methods and demonstrating the benefit of training on synthesized errors.

Findings

01 Achieved a state-of-the-art F0.5 score on the CoNLL 2014 dataset.

02 The character-level model handles out-of-vocabulary words and orthographic errors.

03 Training with synthesized errors improves correction performance.
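
The F0.5 score named in the first finding is the F-beta measure with beta = 0.5, which weights precision more heavily than recall. The sketch below shows the standard formula; the function name `f_beta` and the precision and recall values are illustrative assumptions, not figures from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favors precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative values only (not results reported in the paper).
precision, recall = 0.6, 0.2
f_half = f_beta(precision, recall, beta=0.5)
f_one = f_beta(precision, recall, beta=1.0)
```

With precision higher than recall, as in these made-up values, `f_half` comes out larger than `f_one`, which is why a high-precision corrector is rewarded under F0.5.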

Abstract

Natural language correction has the potential to help language learners improve their writing skills. While approaches with separate classifiers for different error types have high precision, they do not flexibly handle errors such as redundancy or non-idiomatic phrasing. On the other hand, word and phrase-based machine translation methods are not designed to cope with orthographic errors, and have recently been outpaced by neural models. Motivated by these issues, we present a neural network-based approach to language correction. The core component of our method is an encoder-decoder recurrent neural network with an attention mechanism. By operating at the character level, the network avoids the problem of out-of-vocabulary words. We illustrate the flexibility of our approach on a dataset of noisy, user-generated text collected from an English learner forum. When combined with a language…
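
The abstract's point about out-of-vocabulary words can be illustrated with a small sketch of input encoding only (not the encoder-decoder network itself). The alphabet, the toy word vocabulary, and the function names here are all hypothetical and chosen for illustration; the paper's actual preprocessing may differ.

```python
import string

# Hypothetical fixed character inventory; a real system would derive
# its inventory from the training data.
ALPHABET = string.ascii_lowercase + " "
CHAR_TO_ID = {ch: i for i, ch in enumerate(ALPHABET)}

# Hypothetical toy word vocabulary, with id 0 reserved for unknown words.
WORD_TO_ID = {"<unk>": 0, "i": 1, "will": 2, "receive": 3, "it": 4}


def encode_words(text: str) -> list[int]:
    """Word-level encoding: any unseen word collapses to <unk>."""
    return [WORD_TO_ID.get(w, WORD_TO_ID["<unk>"]) for w in text.split()]


def encode_chars(text: str) -> list[int]:
    """Character-level encoding: every character keeps its own id."""
    return [CHAR_TO_ID[ch] for ch in text]


noisy = "i will recieve it"  # "recieve" is a misspelling of "receive"
word_ids = encode_words(noisy)  # the misspelled word becomes <unk>
char_ids = encode_chars(noisy)  # the misspelling is preserved letter by letter

# The character ids can be mapped back to the exact input string.
decoded = "".join(ALPHABET[i] for i in char_ids)
```

At the word level the misspelled token is replaced by the unknown-word id, so the information needed to correct it is lost; at the character level the input is fully recoverable, which is the property the abstract relies on.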

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification