TL;DR
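NeuSpell is an open-source toolkit for English spelling correction. Its context-aware neural models are trained on synthetic misspellings in context, which raises correction rates by 9% (absolute) over training on random character perturbations; richer contextual representations add another 3%. The toolkit also supports practical applications such as defending against adversarial misspellings.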
Contribution
The paper introduces NeuSpell, a spelling correction toolkit that bundles ten models behind a unified command line and a web interface. It also proposes context-aware neural models, trained on synthetic errors in context, that outperform existing methods.
Findings
Training on synthetic errors in context improves correction rates by 9% (absolute) compared with training on randomly sampled character perturbations.
Using richer contextual representations boosts the correction rate by another 3%.
The toolkit is effective in practical scenarios like adversarial misspellings.
Abstract
We introduce NeuSpell, an open-source toolkit for spelling correction in English. Our toolkit comprises ten different models, and benchmarks them on naturally occurring misspellings from multiple sources. We find that many systems do not adequately leverage the context around the misspelt token. To remedy this, (i) we train neural models using spelling errors in context, synthetically constructed by reverse engineering isolated misspellings; and (ii) use contextual representations. By training on our synthetic examples, correction rates improve by 9% (absolute) compared to the case when models are trained on randomly sampled character perturbations. Using richer contextual representations boosts the correction rate by another 3%. Our toolkit enables practitioners to use our proposed and existing spelling correction systems, both via a unified command line, as well as a web interface.…
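The two training-data strategies contrasted in the abstract can be sketched as follows. This is a minimal illustration, not NeuSpell's implementation: the `MISSPELLINGS` table, the function names, and the reading of "reverse engineering isolated misspellings" as a lookup from correct words to previously observed misspellings are all assumptions made for the example.

```python
import random

# Hypothetical table: correct word -> misspellings observed in isolation.
# A real table would come from a corpus of isolated misspellings.
MISSPELLINGS = {
    "receive": ["recieve", "receve"],
    "definitely": ["definately"],
}

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def random_char_noise(word, rng):
    """Baseline: apply one random internal character edit to the word."""
    if len(word) < 4:
        return word  # too short to edit while keeping first and last letters
    i = rng.randrange(1, len(word) - 2)  # internal position, so i + 1 is internal too
    op = rng.choice(["swap", "drop", "add", "replace"])
    letter = rng.choice(ALPHABET)
    if op == "swap":
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "drop":
        return word[:i] + word[i + 1:]
    if op == "add":
        return word[:i] + letter + word[i:]
    return word[:i] + letter + word[i + 1:]


def lookup_noise(word, rng, table=MISSPELLINGS):
    """Assumed 'reverse engineering': replace a word with a known misspelling."""
    options = table.get(word.lower())
    return rng.choice(options) if options else word


def noise_sentence(sentence, noiser, rate, seed=0):
    """Return (clean_tokens, noisy_tokens), aligned token by token."""
    rng = random.Random(seed)
    clean = sentence.split()
    noisy = [noiser(tok, rng) if rng.random() < rate else tok for tok in clean]
    return clean, noisy


if __name__ == "__main__":
    text = "i will definitely receive the package tomorrow"
    for noiser in (random_char_noise, lookup_noise):
        clean, noisy = noise_sentence(text, noiser, rate=0.5, seed=1)
        print(noiser.__name__, "->", " ".join(noisy))
```

Because the clean and noisy token lists stay aligned, each pair can serve as a sequence-labeling training example in which the model sees the whole noisy sentence, the context that the abstract argues many systems underuse.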
