Hallucinations in Neural Automatic Speech Recognition: Identifying Errors and Hallucinatory Models
Rita Frieske, Bertram E. Shi

TL;DR
This paper investigates hallucinations in neural automatic speech recognition (ASR), defining them as fluent and coherent transcriptions that are semantically unrelated to the source utterance, and proposes a perturbation-based method for assessing a model's susceptibility to hallucination at test time without access to the training dataset.
Contribution
It introduces a new framework for detecting hallucinations in ASR, demonstrating how to distinguish hallucinations from genuine outputs and how to induce hallucinations through noise injection.
Findings
Common metrics such as word error rate (WER) cannot differentiate between hallucinatory and non-hallucinatory models.
The proposed perturbation method effectively identifies hallucinations.
Certain types of dataset noise are more likely to cause hallucinations.
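To illustrate why an edit-distance metric can miss the distinction, the sketch below computes word error rate for two hypothetical transcriptions of the same reference. The sentences are invented for illustration and do not come from the paper; the `wer` helper is a plain word-level Levenshtein implementation, not the authors' evaluation code.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance over word tokens."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_words, hyp_words) / len(ref_words)


# Hypothetical examples, not taken from the paper.
REFERENCE = "turn left here"
PHONETIC_ERROR = "tern lift hear"   # sounds like the source, but misspelled
FLUENT_UNRELATED = "i love you"     # fluent, yet unrelated to the source

print(wer(REFERENCE, PHONETIC_ERROR))
print(wer(REFERENCE, FLUENT_UNRELATED))
```

Both hypotheses substitute every word, so they receive the same score even though only the second is fluent and semantically unrelated to the reference. WER counts how many words differ, not how far the meaning has drifted.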
Abstract
Hallucinations are a type of output error produced by deep neural networks. While they have been studied in natural language processing, they have not been researched previously in automatic speech recognition (ASR). Here, we define hallucinations in ASR as transcriptions generated by a model that are semantically unrelated to the source utterance, yet still fluent and coherent. The similarity of hallucinations to probable natural language outputs of the model creates a danger of deception and impacts the credibility of the system. We show that commonly used metrics, such as word error rates, cannot differentiate between hallucinatory and non-hallucinatory models. To address this, we propose a perturbation-based method for assessing the susceptibility of an ASR model to hallucination at test time, which does not require access to the training dataset. We…
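The abstract does not spell out the perturbation procedure, so the following is only a generic sketch of test-time noise injection, not the authors' method. It assumes white Gaussian noise at a chosen signal-to-noise ratio, and `transcribe` is a hypothetical callable standing in for any ASR model. Exact string comparison is a crude proxy; a real probe would need a semantic comparison to tell hallucinations apart from ordinary errors.

```python
import math
import random


def add_noise_at_snr(samples, snr_db, rng):
    """Return a copy of `samples` with white Gaussian noise at the given SNR in dB."""
    if not samples:
        raise ValueError("samples must not be empty")
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def perturbation_probe(samples, transcribe, snr_db=10.0, trials=5, seed=0):
    """Transcribe clean audio and several noisy copies.

    `transcribe` is a hypothetical function mapping a list of float samples
    to a text string. Returns the clean transcript, the noisy transcripts,
    and the fraction of noisy transcripts that differ from the clean one.
    """
    rng = random.Random(seed)
    clean_text = transcribe(samples)
    noisy_texts = [
        transcribe(add_noise_at_snr(samples, snr_db, rng))
        for _ in range(trials)
    ]
    changed = sum(text != clean_text for text in noisy_texts)
    return clean_text, noisy_texts, changed / trials


# Toy usage: a 440 Hz tone at 16 kHz and a placeholder "model".
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
clean, noisy, changed_fraction = perturbation_probe(
    tone, transcribe=lambda audio: "placeholder transcript"
)
print(changed_fraction)
```

Only the model's inputs and outputs are used here, which matches the abstract's point that the assessment needs no access to the training dataset.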
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Deception detection and forensic psychology · Traumatic Brain Injury Research
