edATLAS: An Efficient Disambiguation Algorithm for Texting in Languages with Abugida Scripts
Sourav Ghosh, Sourabh Vasant Gothe, Chandramouli Sanchi, Barath Raj Kandur Raja

TL;DR
This paper introduces edATLAS, a disambiguation algorithm for typing in abugida scripts and their romanized forms that improves typing speed and accuracy in languages such as Hindi, Bengali, and Thai.
Contribution
The paper presents a novel disambiguation algorithm for abugida scripts and romanized text. It addresses challenges specific to native-language typing and improves typing speed, error correction, and next-word prediction.
Findings
Typing speed increased by up to 25.13% in tested languages.
Error Correction F1 score improved by 10.03%.
Next Word Prediction accuracy increased by 62.50%.
Abstract
Abugida refers to a phonogram writing system where each syllable is represented using a single consonant or typographic ligature, along with a default vowel or optional diacritic(s) to denote other vowels. However, texting in these languages has some unique challenges in spite of the advent of devices with soft keyboards supporting custom key layouts. The number of characters in these languages is large enough to require characters to be spread over multiple views in the layout. Having to switch between views many times to type a single word hinders the natural thought process. This prevents popular usage of native keyboard layouts. On the other hand, supporting romanized scripts (native words transcribed using Latin characters) with language model based suggestions is also set back by the lack of uniform romanization rules. To this end, we propose a disambiguation algorithm and…
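To make the abugida definition above concrete, here is a minimal Python sketch of how a Devanagari syllable is composed from a consonant and an optional vowel diacritic. It only illustrates the writing system and is not part of edATLAS; the helper names `syllable` and `describe` are hypothetical and introduced purely for illustration.

```python
import unicodedata

# Unicode code points from the Devanagari block (used for Hindi).
KA = "\u0915"            # consonant "ka", which carries the default vowel "a"
VOWEL_SIGN_I = "\u093F"  # dependent vowel sign (diacritic) for "i"

def syllable(consonant, vowel_sign=""):
    """Hypothetical helper: build one syllable from a consonant plus
    an optional vowel diacritic. With no diacritic, the consonant is
    read with its default vowel."""
    return consonant + vowel_sign

def describe(text):
    """Hypothetical helper: list the Unicode name of each code point."""
    return [unicodedata.name(ch) for ch in text]

ka = syllable(KA)                # one code point, read as "ka"
ki = syllable(KA, VOWEL_SIGN_I)  # two code points, read as "ki"

for label, text in (("ka", ka), ("ki", ki)):
    print(label, len(text), describe(text))
```

A single consonant key already yields a complete syllable, and each additional vowel is a separate diacritic. That gives a sense of how the character inventory grows large enough to spread over multiple keyboard views.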
