Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks
Ori Shapira, Shlomo E. Chazan, Amir DN Cohen

TL;DR
This paper introduces a flexible framework to systematically analyze how transcription errors from speech recognition impact downstream NLP tasks, revealing models' varying tolerance to noise and error types.
Contribution
It presents a novel configurable framework for evaluating the effects of transcription noise on downstream tasks and models, aiding the development of robust spoken language understanding solutions.
Findings
Task models tolerate transcription noise up to a certain level
Different error types affect models differently
The framework supports the design of noise-resilient SLU systems
Abstract
With the increasing prevalence of recorded human speech, spoken language understanding (SLU) is essential for its efficient processing. In order to process the speech, it is commonly transcribed using automatic speech recognition technology. This speech-to-text transition introduces errors into the transcripts, which subsequently propagate to downstream NLP tasks, such as dialogue summarization. While it is known that transcript noise affects downstream tasks, a systematic approach to analyzing its effects across different noise severities and types has not been addressed. We propose a configurable framework for assessing task models in diverse noisy settings, and for examining the impact of transcript-cleaning techniques. The framework facilitates the investigation of task model behavior, which can in turn support the development of effective SLU solutions. We exemplify the utility of…
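To make the idea of evaluating a task model under different noise severities and error types concrete, here is a minimal sketch. It is not the authors' framework: the function names `inject_noise` and `word_error_rate`, the three error types (substitution, deletion, insertion), and the uniform sampling scheme are all assumptions chosen for illustration. It corrupts a clean transcript at a configurable per-word rate and reports the resulting word error rate (WER), which could then be compared against a downstream task score.

```python
import random


def inject_noise(words, rate, error_types=("sub", "del", "ins"), vocab=None, seed=0):
    """Corrupt a word list at a given per-word rate (hypothetical helper).

    Each word is independently corrupted with probability `rate`; the kind of
    corruption is drawn uniformly from `error_types`.
    """
    rng = random.Random(seed)
    vocab = list(vocab or words)  # assumption: replacements come from this pool
    noisy = []
    for word in words:
        if rng.random() >= rate:
            noisy.append(word)  # keep the word unchanged
            continue
        kind = rng.choice(error_types)
        if kind == "sub":
            # May occasionally pick the same word again, leaving it unchanged.
            noisy.append(rng.choice(vocab))
        elif kind == "ins":
            noisy.extend([word, rng.choice(vocab)])  # keep word, add a spurious one
        # kind == "del": drop the word entirely
    return noisy


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / max(len(reference), 1)


if __name__ == "__main__":
    clean = "please send the summary of the meeting to the whole team".split()
    for rate in (0.0, 0.1, 0.3, 0.5):
        noisy = inject_noise(clean, rate, seed=42)
        print(f"rate={rate:.1f} WER={word_error_rate(clean, noisy):.2f} {' '.join(noisy)}")
```

In a fuller setup, each noisy transcript would be passed to the task model (for example, a dialogue summarizer) and the task metric plotted against WER, separately for each error type, to see where performance starts to degrade.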
Taxonomy
Topics: Speech and dialogue systems
