Algorithms for certain classes of Tamil Spelling correction
Muthiah Annamalai, T. Shrinivasan

TL;DR
This paper reviews and proposes algorithmic techniques for Tamil spelling correction, addressing challenges posed by the language's morphological complexity and out-of-dictionary words, to improve spell checking tools.
Contribution
It summarizes existing algorithms for Tamil spelling errors and suggests improvements for handling conjoined words and morphological variations.
Findings
Proposed algorithms efficiently handle out-of-dictionary words
Summarized known techniques for Tamil spelling correction
Identified gaps in current rule-based spell checkers
Abstract
The Tamil language has an agglutinative, diglossic, alpha-syllabary structure that produces a combinatorial explosion of morphological forms, all of which are actively used in Tamil prose and poetry from antiquity to the modern age in an unbroken chain of continuity. For language understanding and spelling correction, however, some of these forms present challenges as out-of-dictionary words. In this paper the authors propose algorithmic techniques to handle, in an efficient way, the specific problem of conjoined (out-of-dictionary) words, for example (in transliteration) [thendRalkattRu] = [thendRal] + [kattRu], when only the parts are present on their own in the word list. The morphological structure of Tamil makes it necessary to depend on a synthesis-analysis approach, and dictionary lists will never be sufficient to truly capture the language. In this paper we have attempted to make a summary of various known algorithms for…
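The conjoined-word case in the abstract can be illustrated with a generic dictionary-based word-break sketch. This is not the authors' algorithm, which the truncated abstract does not spell out; it is a standard dynamic-programming segmentation, shown only to make the [thendRalkattRu] = [thendRal] + [kattRu] example concrete. The function name `split_conjoined` and the two-entry lexicon are hypothetical, the input is assumed to be transliterated text, and sandhi changes at word junctions are not modelled.

```python
from typing import List, Optional, Set


def split_conjoined(word: str, lexicon: Set[str]) -> Optional[List[str]]:
    """Split `word` into the fewest lexicon entries, or return None.

    Hypothetical helper: a plain word-break dynamic program over
    transliterated text. It ignores sandhi (letter changes at joins).
    """
    n = len(word)
    # best[i] is a fewest-parts segmentation of word[:i], or None if
    # word[:i] cannot be built from lexicon entries.
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prefix = best[start]
            piece = word[start:end]
            if prefix is None or piece not in lexicon:
                continue
            candidate = prefix + [piece]
            current = best[end]
            if current is None or len(candidate) < len(current):
                best[end] = candidate
    return best[n]


# Assumed toy word list containing only the two parts from the abstract.
lexicon = {"thendRal", "kattRu"}
print(split_conjoined("thendRalkattRu", lexicon))  # ['thendRal', 'kattRu']
```

A spell checker could use a non-`None` result as evidence that an out-of-dictionary word is a valid compound rather than a misspelling. For text in Tamil script, the string would presumably need to be tokenized into Tamil letters (consonant plus vowel-sign units) before indexing, since Python string indices address code points rather than letters.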
Taxonomy
Topics: Natural Language Processing Techniques · Algorithms and Data Compression · Handwritten Text Recognition Techniques
