Word-Level ASR Quality Estimation for Efficient Corpus Sampling and Post-Editing through Analyzing Attentions of a Reference-Free Metric
Golara Javadi, Kamer Ali Yuksel, Yunsu Kim, Thiago Castro Ferreira, Mohamed Al-Badrashiny

TL;DR
This paper introduces NoRefER, a reference-free quality estimation metric for ASR that improves error detection, dataset annotation, and model transparency, aiding post-editing and system fine-tuning.
Contribution
The study presents NoRefER as a novel, explainable, reference-free metric for word-level ASR error estimation, enhancing transparency and efficiency in post-editing and corpus building.
Findings
NoRefER effectively identifies word errors in ASR outputs.
It improves dataset annotation for training and evaluation.
The metric enhances model interpretability and post-editing workflows.
Abstract
In the realm of automatic speech recognition (ASR), the quest for models that not only perform with high accuracy but also offer transparency in their decision-making processes is crucial. The potential of quality estimation (QE) metrics is introduced and evaluated as a novel tool to enhance explainable artificial intelligence (XAI) in ASR systems. Through experiments and analyses, the capabilities of the NoRefER (No Reference Error Rate) metric are explored in identifying word-level errors to aid post-editors in refining ASR hypotheses. The investigation also extends to the utility of NoRefER in the corpus-building process, demonstrating its effectiveness in augmenting datasets with insightful annotations. The diagnostic aspects of NoRefER are examined, revealing its ability to provide valuable insights into model behaviors and decision patterns. This has proven beneficial for…
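The title indicates that word-level estimates come from analyzing the attentions of the reference-free metric, but this page does not describe the procedure. The following is only a minimal sketch of one way such an analysis could look, not the authors' implementation. It assumes that a per-token attention weight is already available for each subword of an ASR hypothesis, that subwords use a WordPiece-style "##" continuation prefix, and that summing weights per word is a reasonable aggregation. The function names `word_scores` and `flag_words` are hypothetical.

```python
from typing import List, Tuple


def word_scores(tokens: List[str], attention: List[float]) -> List[Tuple[str, float]]:
    """Merge subword tokens into words and sum their attention weights.

    Assumption: tokens starting with "##" continue the previous word
    (WordPiece-style); other tokenizers need a different merge rule.
    """
    if len(tokens) != len(attention):
        raise ValueError("tokens and attention must have the same length")

    words: List[Tuple[str, float]] = []
    for tok, att in zip(tokens, attention):
        if tok.startswith("##") and words:
            prev_word, prev_score = words[-1]
            words[-1] = (prev_word + tok[2:], prev_score + att)
        else:
            words.append((tok, att))

    # Normalize so the word-level scores sum to 1 (skipped if all weights are zero).
    total = sum(score for _, score in words)
    if total > 0:
        words = [(word, score / total) for word, score in words]
    return words


def flag_words(scored: List[Tuple[str, float]], top_k: int = 1) -> List[int]:
    """Return the positions of the top_k highest-scoring words, in sentence order."""
    ranked = sorted(range(len(scored)), key=lambda i: scored[i][1], reverse=True)
    return sorted(ranked[:top_k])


if __name__ == "__main__":
    # Made-up hypothesis and weights, for illustration only.
    tokens = ["the", "whe", "##ther", "is", "nice"]
    attention = [0.1, 0.3, 0.3, 0.1, 0.2]
    scored = word_scores(tokens, attention)
    for idx in flag_words(scored, top_k=1):
        print("candidate error word:", scored[idx][0])
```

In a post-editing setting, the flagged positions could be highlighted for a human editor; how well high attention actually tracks real errors is an empirical question that the sketch does not address.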
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis · Natural Language Processing Techniques
