Hallucinations in Neural Automatic Speech Recognition: Identifying Errors and Hallucinatory Models

Rita Frieske; Bertram E. Shi

arXiv:2401.01572 · cs.CL · January 4, 2024 · 2 cites

Open Access

TL;DR

This paper investigates hallucinations in neural automatic speech recognition (ASR), defining them as transcriptions that are semantically unrelated to the source utterance yet fluent and coherent. It proposes a perturbation-based method for assessing a model's susceptibility to hallucination at test time, without access to the training dataset.

Contribution

It introduces a new framework for detecting hallucinations in ASR, demonstrating how to distinguish hallucinations from genuine outputs and how to induce hallucinations through noise injection.

Findings

01

Commonly used metrics such as word error rate (WER) cannot differentiate between hallucinatory and non-hallucinatory models.

02

The proposed perturbation method effectively identifies hallucinations.

03

Certain types of dataset noise are more likely to induce hallucinations.
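
To make the first finding concrete, here is a minimal sketch (not the paper's code) of why WER alone says little about hallucination. The `wer` helper and all three sentences are invented for illustration: a garbled transcript and a fluent but unrelated one receive the same score.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example sentences, not taken from the paper.
reference = "please call me back tomorrow morning"
garbled = "pleas coal mi bag tomorow mourning"      # disfluent, sound-alike errors
unrelated = "thank you for watching this video"     # fluent, semantically unrelated

print(wer(reference, garbled), wer(reference, unrelated))
```

Both hypotheses differ from the reference in every word, so WER scores them identically even though only the second is fluent and unrelated in the sense defined in the abstract. Separating the two would need an additional signal, such as a measure of fluency or semantic similarity.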

Abstract

Hallucinations are a type of output error produced by deep neural networks. While they have been studied in natural language processing, they have not previously been researched in automatic speech recognition (ASR). Here, we define hallucinations in ASR as transcriptions generated by a model that are semantically unrelated to the source utterance, yet still fluent and coherent. The similarity of hallucinations to probable natural language outputs of the model creates a danger of deception and impacts the credibility of the system. We show that commonly used metrics, such as word error rates, cannot differentiate between hallucinatory and non-hallucinatory models. To address this, we propose a perturbation-based method for assessing the susceptibility of an ASR model to hallucination at test time, which does not require access to the training dataset. We…
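
The truncated abstract does not spell out the perturbation procedure, so the following is only a rough sketch of one way to perturb an utterance at test time. It assumes additive white Gaussian noise at a chosen signal-to-noise ratio, which may differ from the perturbations the paper uses. The names `add_white_noise` and `transcripts_under_perturbation` are hypothetical, and `transcribe` stands in for whatever ASR model is under test.

```python
import math
import random
from typing import Callable, Dict, Sequence


def add_white_noise(samples: Sequence[float], snr_db: float, seed: int = 0) -> list:
    """Return a copy of `samples` with Gaussian noise added at roughly `snr_db` dB SNR.

    Assumption: white Gaussian noise is used purely for illustration.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    rng = random.Random(seed)
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def transcripts_under_perturbation(
    transcribe: Callable[[Sequence[float]], str],  # hypothetical ASR callable
    samples: Sequence[float],
    snr_levels_db: Sequence[float],
) -> Dict[float, str]:
    """Transcribe the same utterance at several noise levels, using only test-time input."""
    return {snr: transcribe(add_white_noise(samples, snr)) for snr in snr_levels_db}
```

A caller could then compare each perturbed transcript with the clean one and watch for outputs that stay fluent while drifting away from the source utterance. How that drift is scored is left open here, since it is not stated in the text above.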

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Deception detection and forensic psychology · Traumatic Brain Injury Research