Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages?
Priyanshi Shah, Harveen Singh Chadha, Anirudh Gupta, Ankur Dhuriya, Neeraj Chhimwal, Rishabh Gaur, Vivek Raghavan

TL;DR
This paper introduces two error metrics, Alternate Word Error Rate (AWER) and Alternate Character Error Rate (ACER), for Indic languages with complex scripts, and reports more interpretable error analysis than traditional WER and CER in Hindi speech recognition.
Contribution
The paper proposes and validates new error metrics, AWER and ACER, specifically designed for Indic languages with complex character forms, enhancing error analysis in speech recognition systems.
Findings
AWER and ACER give a more interpretable error analysis than WER and CER in Hindi speech recognition
Interpretability of the ASR system improves by up to 3% in AWER and 7% in ACER
Open-source dataset for Hindi with new metrics and scripts
Abstract
We propose a new method for the calculation of error rates in Automatic Speech Recognition (ASR). This new metric is for languages that contain half characters and where the same character can be written in different forms. We implement our methodology in Hindi, which is one of the main languages in the Indic context, and we think this approach is scalable to other similar languages containing a large character set. We call our metrics Alternate Word Error Rate (AWER) and Alternate Character Error Rate (ACER). We train our ASR models using wav2vec 2.0\cite{baevski2020wav2vec} for Indic languages. Additionally, we use language models to improve our model performance. Our results show a significant improvement in analyzing the error rates at word and character level, and the interpretability of the ASR system is improved up to \% in AWER and \% in ACER for Hindi. Our experiments suggest…
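To make the idea concrete, the sketch below computes standard WER and CER from edit distance, then adds an "alternate-aware" variant that maps known spelling variants to one canonical form before scoring. This is only an illustration of the general idea described in the abstract, not the authors' implementation: the `ALTERNATES` table, the `canonicalize` helper, and the `alternate_wer` and `alternate_cer` functions are hypothetical names introduced here, and the single table entry (two accepted spellings of the word for "Hindi") is an example, not data from the paper.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: word-level edit distance / reference word count."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def cer(ref, hyp):
    """Character error rate over Unicode code points."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical table: variant spelling -> canonical spelling.
# Assumption for illustration only; the paper's actual resource may differ.
ALTERNATES = {"हिन्दी": "हिंदी"}


def canonicalize(text, alternates=ALTERNATES):
    """Apply Unicode NFC, then replace each known variant word."""
    text = unicodedata.normalize("NFC", text)
    return " ".join(alternates.get(w, w) for w in text.split())


def alternate_wer(ref, hyp, alternates=ALTERNATES):
    """WER after mapping both sides to canonical spellings (assumed scheme)."""
    return wer(canonicalize(ref, alternates), canonicalize(hyp, alternates))


def alternate_cer(ref, hyp, alternates=ALTERNATES):
    """CER after mapping both sides to canonical spellings (assumed scheme)."""
    return cer(canonicalize(ref, alternates), canonicalize(hyp, alternates))


reference = "मुझे हिन्दी पसंद है"
hypothesis = "मुझे हिंदी पसंद है"
print(wer(reference, hypothesis))            # 0.25
print(alternate_wer(reference, hypothesis))  # 0.0
```

In this example the hypothesis differs from the reference only in how one word is spelled, so plain WER charges one substitution out of four words, while the alternate-aware score treats the two spellings as equivalent. Canonicalizing both sides keeps the scoring symmetric; a real system would need a curated variant list and would have to decide how to handle variants that span multiple words.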
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and Audio Processing
Methods: Stochastic Dueling Network · Softmax · Retrace · Trust Region Policy Optimization · Dense Connections · Entropy Regularization · Convolution · Experience Replay · ACER
