A Finite State and Rule-based Akshara to Prosodeme (A2P) Converter in Hindi
Somnath Roy

TL;DR
This paper presents a rule-based, finite-state approach for converting Hindi text into prosodemes (phoneme sequences with syllable boundaries and prosodic labels), reporting more than 99% accuracy on syllabification, syllable labeling, schwa deletion and prosodic labeling.
Contribution
It introduces a novel finite state and rule-based system for Hindi A2P conversion, including nonlinear phonological rules for schwa deletion and prosodic labeling.
Findings
Over 99% accuracy in syllabification and prosodic labeling
Effective handling of schwa deletion in various word forms
Implementation in Python demonstrating practical usability
Abstract
This article describes a software module called the Akshara to Prosodeme (A2P) converter for Hindi. It converts input graphemes into a prosodeme (a sequence of phonemes with syllable boundaries and prosodic labels specified). The software is based on two proposed finite state machines, one for syllabification and another for syllable labeling. In addition, it uses a set of nonlinear phonological rules proposed for foot formation in Hindi, which cover schwa deletion in simple, compound, derived and inflected words. The nonlinear phonological rules are based on metrical phonology with provision for recursive foot structure. The software module is implemented in Python. Testing the software on syllabification, syllable labeling, schwa deletion and prosodic labeling yields an accuracy of more than 99% on a lexicon of size 28664…
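To make the idea of finite-state syllabification concrete, the following is a minimal sketch in Python. It is not the machine proposed in the paper. The romanized phoneme inventory, the state names (ONSET, NUCLEUS, CODA), the function name `syllabify`, and the rule that exactly one consonant before a vowel becomes the next onset are all assumptions made for illustration.

```python
# Illustrative only: a toy finite-state syllabifier over romanized phonemes.
# The vowel inventory, state names, and the "one consonant goes to the next
# onset" rule are assumptions for this sketch, not the paper's machines.

VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "o", "ai", "au"}  # assumed


def syllabify(phonemes):
    """Split a list of phonemes into syllables using a three-state scan."""
    syllables = []   # completed syllables
    current = []     # syllable currently being built
    pending = []     # consonants seen after the nucleus, not yet assigned
    state = "ONSET"  # ONSET -> NUCLEUS -> CODA -> NUCLEUS ...

    for p in phonemes:
        is_vowel = p in VOWELS
        if state == "ONSET":
            current.append(p)
            if is_vowel:
                state = "NUCLEUS"
        elif state == "NUCLEUS":
            if is_vowel:
                # Two vowels in a row: close the syllable, start a new one.
                syllables.append(current)
                current = [p]
            else:
                pending = [p]
                state = "CODA"
        else:  # state == "CODA"
            if is_vowel:
                # All pending consonants but the last close the old syllable;
                # the last one becomes the onset of the new syllable.
                current.extend(pending[:-1])
                syllables.append(current)
                current = [pending[-1], p]
                pending = []
                state = "NUCLEUS"
            else:
                pending.append(p)

    # Word-final consonants attach to the last syllable as its coda.
    current.extend(pending)
    if current:
        syllables.append(current)
    return syllables
```

Under these assumptions, `syllabify(["k", "a", "m", "a", "l"])` returns `[["k", "a"], ["m", "a", "l"]]`. Schwa deletion and prosodic labeling, which the paper handles with foot-formation rules, are outside the scope of this sketch.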
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Phonetics and Phonology Research
