ParsiNorm: A Persian Toolkit for Speech Processing Normalization
Romina Oji, Seyedeh Fatemeh Razavi, Sajjad Abdi Dehsorkh, Alireza Hariri, Hadi Asheri, Reshad Hosseini

TL;DR
ParsiNorm is an open-source Persian normalization toolkit designed for speech processing, converting symbols, numbers, and text into pronunciation-ready formats, improving the performance of embedded speech models.
Contribution
This paper introduces the first comprehensive Persian normalization toolkit tailored for speech processing modules, addressing a gap in existing language processing tools.
Findings
Outperforms existing Persian normalization tools in speech applications
Achieves accurate sentence separation comparable to HAZM and Parsivar
Demonstrates effective normalization on Persian Wikipedia data
Abstract
In general, speech processing models consist of a language model along with an acoustic model. Regardless of the language model's complexity and variants, three critical pre-processing steps are needed in language models: cleaning, normalization, and tokenization. Among these steps, normalization is essential for format unification in purely textual applications. However, for language models embedded in speech processing modules, normalization is not limited to format unification: it must also convert each readable symbol, number, etc., to the way it is pronounced. To the best of our knowledge, there is no Persian normalization toolkit for language models embedded in speech processing modules, so in this paper we propose an open-source normalization toolkit for text processing in speech applications. Briefly, we consider different readable Persian text like symbols…
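To make the idea of pronunciation-oriented normalization concrete, here is a minimal Python sketch. It is not ParsiNorm's API or implementation: every name in it (normalize_for_speech, number_to_words, SYMBOLS, and so on) is hypothetical, and it only covers integers from 0 to 999 and the percent sign. It unifies Persian and Arabic-Indic digits to ASCII, then replaces numbers and symbols with the Persian words a speaker would read aloud.

```python
import re

# Hypothetical illustration only; these names are NOT part of ParsiNorm.

# Format unification: map Persian (U+06F0..) and Arabic-Indic (U+0660..) digits to ASCII.
DIGIT_MAP = {0x06F0 + i: str(i) for i in range(10)}
DIGIT_MAP.update({0x0660 + i: str(i) for i in range(10)})

AND = " و "  # Persian conjunction placed between the parts of a number

ONES = ["صفر", "یک", "دو", "سه", "چهار", "پنج", "شش", "هفت", "هشت", "نه",
        "ده", "یازده", "دوازده", "سیزده", "چهارده", "پانزده", "شانزده",
        "هفده", "هجده", "نوزده"]
TENS = {2: "بیست", 3: "سی", 4: "چهل", 5: "پنجاه",
        6: "شصت", 7: "هفتاد", 8: "هشتاد", 9: "نود"}
HUNDREDS = {1: "صد", 2: "دویست", 3: "سیصد", 4: "چهارصد", 5: "پانصد",
            6: "ششصد", 7: "هفتصد", 8: "هشتصد", 9: "نهصد"}

# Spoken form of symbols; only the ASCII and Arabic percent signs are covered here.
SYMBOLS = {"%": "درصد", "\u066a": "درصد"}


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 in Persian words."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n < 20:
        return ONES[n]
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        parts.append(HUNDREDS[hundreds])
    if rest >= 20:
        tens, ones = divmod(rest, 10)
        parts.append(TENS[tens])
        if ones:
            parts.append(ONES[ones])
    elif rest > 0:
        parts.append(ONES[rest])
    return AND.join(parts)


def normalize_for_speech(text: str) -> str:
    """Unify digit forms, then verbalize symbols and small numbers."""
    text = text.translate(DIGIT_MAP)
    for symbol, spoken in SYMBOLS.items():
        text = text.replace(symbol, f" {spoken} ")

    def verbalize(match: re.Match) -> str:
        value = int(match.group())
        # Numbers outside the supported range are left untouched.
        return number_to_words(value) if value <= 999 else match.group()

    text = re.sub(r"\d+", verbalize, text)
    return " ".join(text.split())  # collapse the extra spaces added above


if __name__ == "__main__":
    print(normalize_for_speech("\u06f2\u06f5\u066a"))  # Persian digits 2 and 5, then the percent sign
```

A text-only normalizer would typically stop after the digit-unification step; the verbalization step is what a speech pipeline additionally needs, since the language model should see the words that are actually spoken.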
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and dialogue systems
