Error-tolerant Finite State Recognition with Applications to Morphological Analysis and Spelling Correction
Kemal Oflazer (Department of Computer Engineering and Information Science, Bilkent University, Ankara, Turkey)

TL;DR
This paper introduces error-tolerant finite state recognition algorithms for morphological analysis and spelling correction, and demonstrates their efficiency and applicability across multiple languages, including Turkish, English, and other European languages.
Contribution
It presents novel algorithms for error-tolerant recognition using finite state transducers, enabling accurate morphological analysis and spelling correction across diverse languages.
Findings
Efficient candidate generation within 10-45 ms for large word lists
Successful application to Turkish agglutinative morphology
Effective spelling correction for multiple European languages
Abstract
Error-tolerant recognition enables the recognition of strings that deviate mildly from any string in the regular set recognized by the underlying finite state recognizer. Such recognition has applications in error-tolerant morphological processing, spelling correction, and approximate string matching in information retrieval. After a description of the concepts and algorithms involved, we give examples from two applications: In the context of morphological analysis, error-tolerant recognition allows misspelled input word forms to be corrected, and morphologically analyzed concurrently. We present an application of this to error-tolerant analysis of agglutinative morphology of Turkish words. The algorithm can be applied to morphological analysis of any language whose morphology is fully captured by a single (and possibly very large) finite state transducer, regardless of the word…
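The core idea in the abstract, accepting strings that lie within a small edit distance of some string in the recognizer's language, can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the regular set is a finite word list stored as a trie (a simple acyclic deterministic recognizer), and it uses plain Levenshtein distance (insertions, deletions, substitutions) computed one row per trie edge, pruning any branch whose best achievable distance already exceeds the threshold. The names `TrieNode`, `build_trie`, and `tolerant_match` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Iterable, List, Tuple


class TrieNode:
    """One state of an acyclic deterministic recognizer (a trie)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_final = False


def build_trie(words: Iterable[str]) -> TrieNode:
    """Build a trie that recognizes exactly the given words."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_final = True
    return root


def tolerant_match(root: TrieNode, query: str, threshold: int) -> List[Tuple[str, int]]:
    """Return (word, distance) pairs for words within `threshold` edits of `query`.

    Assumption: distance is plain Levenshtein distance; other error models
    (for example, ones that count transpositions) are not covered here.
    """
    results: List[Tuple[str, int]] = []
    # Distance from the empty prefix to each prefix of the query.
    first_row = list(range(len(query) + 1))

    def visit(node: TrieNode, prefix: str, prev_row: List[int]) -> None:
        # A final state whose full-query distance is small enough is a match.
        if node.is_final and prev_row[-1] <= threshold:
            results.append((prefix, prev_row[-1]))
        for ch, child in node.children.items():
            # Extend the candidate by `ch` and compute the next DP row.
            row = [prev_row[0] + 1]
            for j in range(1, len(query) + 1):
                cost = 0 if query[j - 1] == ch else 1
                row.append(min(row[j - 1] + 1,           # skip a query character
                               prev_row[j] + 1,          # extra character in candidate
                               prev_row[j - 1] + cost))  # match or substitute
            # Row minima never decrease along a path, so this cut-off is safe.
            if min(row) <= threshold:
                visit(child, prefix + ch, row)

    visit(root, "", first_row)
    return sorted(results, key=lambda pair: (pair[1], pair[0]))
```

For instance, with a trie built from "recognition", "recognize", "reception", and "record", calling `tolerant_match(trie, "recogniton", 1)` returns only "recognition" at distance 1. The pruning step is what keeps the search confined to a small part of the recognizer instead of comparing the query against every word.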
Taxonomy
Topics: Natural Language Processing Techniques · Algorithms and Data Compression · Topic Modeling
