A Winnow-Based Approach to Context-Sensitive Spelling Correction
Andrew R. Golding, Dan Roth

TL;DR
This paper introduces WinSpell, a Winnow-based algorithm for context-sensitive spelling correction, demonstrating superior accuracy and adaptability over existing methods by effectively handling high-dimensional features and learning better linear separators.
Contribution
The paper presents a novel Winnow-based approach that combines variants of Winnow and weighted-majority voting for improved context-sensitive spelling correction.
Findings
WinSpell achieves higher accuracy than BaySpell when both use the full feature set.
WinSpell achieves the highest performance among the systems reported in the literature.
WinSpell adapts better than BaySpell to an unfamiliar test corpus by combining supervised learning on the training set with unsupervised learning on the test set.
Abstract
A large class of machine-learning problems in natural language require the characterization of linguistic context. Two characteristic properties of such problems are that their feature space is of very high dimensionality, and their target concepts refer to only a small subset of the features in the space. Under such conditions, multiplicative weight-update algorithms such as Winnow have been shown to have exceptionally good theoretical properties. We present an algorithm combining variants of Winnow and weighted-majority voting, and apply it to a problem in the aforementioned class: context-sensitive spelling correction. This is the task of fixing spelling errors that happen to result in valid words, such as substituting "to" for "too", "casual" for "causal", etc. We evaluate our algorithm, WinSpell, by comparing it against BaySpell, a statistics-based method representing the state of…
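To make the multiplicative weight-update idea concrete, here is a minimal sketch of a Winnow learner over sparse Boolean features, together with a weighted-majority combiner over several such learners. It is an illustration under stated assumptions, not the WinSpell implementation: the class names, the default parameters (threshold, alpha, beta, gamma), the initial weight of 1.0 for unseen features, and the feature strings in the usage lines are all hypothetical choices made for this example.

```python
from typing import Dict, List, Set


class Winnow:
    """Mistake-driven linear classifier with multiplicative updates.

    An example is the set of its active (true) Boolean features, so the
    cost of a prediction depends on the active features only, not on the
    size of the full feature space.
    """

    def __init__(self, threshold: float = 1.0, alpha: float = 1.5, beta: float = 0.5):
        self.threshold = threshold  # decision threshold (assumed value)
        self.alpha = alpha          # promotion factor, > 1 (assumed value)
        self.beta = beta            # demotion factor, in (0, 1) (assumed value)
        self.weights: Dict[str, float] = {}  # unseen features default to 1.0

    def score(self, active: Set[str]) -> float:
        return sum(self.weights.get(f, 1.0) for f in active)

    def predict(self, active: Set[str]) -> bool:
        return self.score(active) >= self.threshold

    def update(self, active: Set[str], label: bool) -> None:
        # Weights change only on a mistake, and only for active features.
        if self.predict(active) == label:
            return
        factor = self.alpha if label else self.beta  # promote or demote
        for f in active:
            self.weights[f] = self.weights.get(f, 1.0) * factor


class WeightedMajority:
    """Combine several classifiers by a weighted vote.

    Each member that predicts wrongly has its voting weight multiplied
    by gamma, so consistently poor members lose influence over time.
    """

    def __init__(self, experts: List[Winnow], gamma: float = 0.5):
        self.experts = list(experts)
        self.gamma = gamma  # penalty factor, in (0, 1) (assumed value)
        self.votes = [1.0] * len(self.experts)

    def predict(self, active: Set[str]) -> bool:
        yes = sum(v for e, v in zip(self.experts, self.votes) if e.predict(active))
        return yes >= sum(self.votes) / 2  # ties count as a positive vote

    def update(self, active: Set[str], label: bool) -> None:
        for i, expert in enumerate(self.experts):
            if expert.predict(active) != label:
                self.votes[i] *= self.gamma
            expert.update(active, label)


# Hypothetical usage: decide whether "too" (True) or "to" (False) fits a context.
ensemble = WeightedMajority([Winnow(beta=0.5), Winnow(beta=0.8)])
ensemble.update({"word:much", "colloc:_ much"}, True)
ensemble.update({"word:go", "colloc:_ VERB"}, False)
guess = ensemble.predict({"word:much"})
```

Because only the weights of active features are touched, and only on mistakes, the learner stays cheap even when the feature space is very large and most features are irrelevant to the target concept, which is the setting the abstract describes.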
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Natural Language Processing Techniques
