# Explanations for Automatic Speech Recognition

**Authors:** Xiaoliang Wu, Peter Bell, Ajitha Rajan

arXiv: 2302.14062 · 2023-03-01

## TL;DR

This paper develops methods to explain neural network-based automatic speech recognition (ASR) outputs by identifying minimal audio segments responsible for transcriptions, enhancing system understanding and trust.

## Contribution

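The paper adapts two explainable AI (XAI) techniques from image classification, Statistical Fault Localisation (SFL) and Causal, to explain ASR output, and uses an adapted version of LIME as a baseline. The adaptation addresses the challenge of variable-length transcriptions, which existing interpretable machine learning models do not handle.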
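To illustrate why variable-length transcriptions are awkward for label-based explanation methods, the sketch below reduces a pair of transcriptions to a binary "same / different" judgement by thresholding a word error rate. This is only one plausible way to do it: the function names and the `max_wer` threshold are assumptions made for illustration and are not taken from the paper.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance normalised by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return word_edit_distance(ref, hyp) / len(ref)


def same_transcription(reference, hypothesis, max_wer=0.2):
    """Hypothetical pass/fail rule: transcriptions count as 'the same'
    when their word error rate stays at or below max_wer (assumed value)."""
    return word_error_rate(reference, hypothesis) <= max_wer
```

With a rule like this, a perturbed audio clip can be treated as preserving or changing the transcription, which is the kind of binary signal that techniques designed for classification labels expect.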

## Key findings

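- An explanation is a subset of audio frames that is both a minimal and a sufficient cause of the transcription.
- Explanation quality is evaluated on three ASR systems (Google API, the baseline model of Sphinx, and Deepspeech) using 100 audio samples from the Commonvoice dataset.
- An adapted version of LIME serves as the baseline for comparison.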

## Abstract

We address quality assessment for neural network based ASR by providing explanations that help increase our understanding of the system and ultimately help build trust in the system. Compared to simple classification labels, explaining transcriptions is more challenging, as judging their correctness is not straightforward and transcriptions, as variable-length sequences, are not handled by existing interpretable machine learning models. We provide an explanation for an ASR transcription as a subset of audio frames that is both a minimal and sufficient cause of the transcription. To do this, we adapt existing explainable AI (XAI) techniques from image classification: Statistical Fault Localisation (SFL) and Causal. Additionally, we use an adapted version of Local Interpretable Model-Agnostic Explanations (LIME) for ASR as a baseline in our experiments. We evaluate the quality of the explanations generated by the proposed techniques over three different ASR systems (Google API, the baseline model of Sphinx, and Deepspeech) and 100 audio samples from the Commonvoice dataset.
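The idea of a minimal and sufficient subset of frames can be sketched with a toy example. The code below is a rough illustration in the spirit of SFL, not the authors' implementation: `toy_asr` is a hypothetical stand-in recogniser, masking a frame means replacing it with silence, and the Ochiai-style score (and how it is mapped onto kept and masked frames) is an assumption chosen for illustration.

```python
import math
import random

SILENCE = 0.0


def toy_asr(frames):
    """Hypothetical stand-in recogniser: outputs "yes" only when
    frames 2 and 5 are both audible (non-silent)."""
    return "yes" if frames[2] != SILENCE and frames[5] != SILENCE else ""


def keep_only(frames, kept):
    """Return a copy of the audio with every frame outside `kept` silenced."""
    return [f if i in kept else SILENCE for i, f in enumerate(frames)]


def rank_frames(frames, asr, n_mutants=200, keep_prob=0.5, seed=0):
    """Rank frames by an Ochiai-style score computed over random masked mutants."""
    rng = random.Random(seed)
    target = asr(frames)
    n = len(frames)
    kept_same = [0] * n   # frame kept, transcription preserved
    kept_diff = [0] * n   # frame kept, transcription changed
    total_same = 0        # mutants that preserved the transcription
    for _ in range(n_mutants):
        kept = {i for i in range(n) if rng.random() < keep_prob}
        same = asr(keep_only(frames, kept)) == target
        total_same += same
        for i in kept:
            if same:
                kept_same[i] += 1
            else:
                kept_diff[i] += 1
    scores = []
    for i in range(n):
        denom = math.sqrt(total_same * (kept_same[i] + kept_diff[i]))
        scores.append(kept_same[i] / denom if denom else 0.0)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


def explain(frames, asr, **rank_kwargs):
    """Greedily build a sufficient frame subset, then prune it towards minimality."""
    target = asr(frames)
    chosen = set()
    for i in rank_frames(frames, asr, **rank_kwargs):
        chosen.add(i)
        if asr(keep_only(frames, chosen)) == target:
            break  # the subset is now sufficient
    for i in sorted(chosen):
        # Drop any frame whose removal still preserves the transcription.
        if asr(keep_only(frames, chosen - {i})) == target:
            chosen.discard(i)
    return sorted(chosen)


audio = [1.0] * 8
explanation = explain(audio, toy_asr)
```

On this toy input the returned explanation contains exactly the two frames the recogniser depends on. A real ASR system would be queried in place of `toy_asr`, and exact string equality would typically be replaced by a similarity test between transcriptions.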

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14062/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/2302.14062/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/2302.14062/full.md

---
Source: https://tomesphere.com/paper/2302.14062