Spelling Error Correction Using a Nested RNN Model and Pseudo Training Data

Hao Li; Yang Wang; Xinyu Liu; Zhichao Sheng; Si Wei

arXiv:1811.00238 · cs.CL · November 2, 2018 · 25 cites

TL;DR

This paper introduces a nested RNN model for English spelling error correction, trained end-to-end on pseudo data generated from phonetic similarity. The authors report that it corrects spelling errors better than existing systems, without feature engineering or a noisy channel model.

Contribution

The paper presents a novel nested RNN architecture combined with pseudo data generation for effective spelling error correction, bypassing traditional noisy channel approaches.

Findings

01 Outperforms existing spelling correction systems

02 Effective use of pseudo data improves accuracy

03 End-to-end training simplifies the correction process

Abstract

We propose a nested recurrent neural network (nested RNN) model for English spelling error correction and generate pseudo data based on phonetic similarity to train it. The model fuses orthographic information and context as a whole and is trained in an end-to-end fashion. This avoids feature engineering and does not rely on a noisy channel model as in traditional methods. Experiments show that the proposed method is superior to existing systems in correcting spelling errors.
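
To make the idea of phonetically generated pseudo data concrete, here is a minimal sketch. The abstract does not say how phonetic similarity is computed, so this example assumes a simplified Soundex key as a stand-in. The function names (`soundex`, `build_confusion_sets`, `make_pseudo_pair`) and the `rate` parameter are hypothetical and are not taken from the paper.

```python
import random

# Simplified Soundex digit table. This is an illustrative stand-in for
# "phonetic similarity"; it is NOT the phonetic model used in the paper.
_CODES = {
    ch: digit
    for digit, letters in {
        "1": "bfpv", "2": "cgjkqsxz", "3": "dt",
        "4": "l", "5": "mn", "6": "r",
    }.items()
    for ch in letters
}


def soundex(word):
    """Return a 4-character Soundex-style key for a non-empty alphabetic word."""
    word = word.lower()
    key = word[0].upper()
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        # 'h' and 'w' do not separate repeated codes; vowels do.
        if ch not in "hw":
            prev = code
    return (key + "000")[:4]


def build_confusion_sets(vocab):
    """Map each word to the other vocabulary words sharing its phonetic key."""
    groups = {}
    for word in vocab:
        groups.setdefault(soundex(word), []).append(word)
    return {w: [o for o in groups[soundex(w)] if o != w] for w in vocab}


def make_pseudo_pair(sentence, confusions, rng, rate=0.3):
    """Return (noisy_tokens, clean_tokens) for one whitespace-tokenized sentence.

    Each word that has phonetic neighbours is swapped for one of them
    with probability `rate` (a hypothetical knob, not from the paper).
    """
    clean = sentence.split()
    noisy = []
    for word in clean:
        options = confusions.get(word, [])
        if options and rng.random() < rate:
            noisy.append(rng.choice(options))
        else:
            noisy.append(word)
    return noisy, clean


if __name__ == "__main__":
    vocab = ["their", "there", "cat", "sat"]
    confusions = build_confusion_sets(vocab)
    rng = random.Random(0)
    print(make_pseudo_pair("their cat sat there", confusions, rng, rate=0.5))
```

Pairs of this form (noisy input, clean target) could then be used to train a correction model end-to-end. Which confusions are realistic depends entirely on the phonetic model chosen.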

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis