Measuring the Accuracy of Automatic Speech Recognition Solutions

Korbinian Kuhn; Verena Kersken; Benedikt Reuter; Niklas Egger; Gottfried Zimmermann

arXiv:2408.16287·cs.CL·August 30, 2024

TL;DR

This study evaluates the real-world accuracy of eleven popular ASR services under diverse conditions. It finds wide variability between services and lower reliability than claims of near-human performance suggest, especially in streaming scenarios.

Contribution

It provides an independent, comprehensive assessment of ASR accuracy in educational settings, highlighting discrepancies between reported and actual performance.

Findings

01. Accuracy varies widely across ASR vendors.
02. Streaming ASR performs significantly worse than offline.
03. ASR reliability remains a concern despite recent advancements.

Abstract

For d/Deaf and hard of hearing (DHH) people, captioning is an essential accessibility tool. Significant developments in artificial intelligence (AI) mean that Automatic Speech Recognition (ASR) is now a part of many popular applications. This makes creating captions easy and broadly available, but transcription needs high levels of accuracy to be accessible. Scientific publications and industry report very low error rates, claiming AI has reached human parity or even outperforms manual transcription. At the same time, the DHH community reports serious issues with the accuracy and reliability of ASR. There seems to be a mismatch between technical innovations and the real-life experience of people who depend on transcription. Independent and comprehensive data is needed to capture the state of ASR. We measured the performance of eleven common ASR services with recordings of Higher…

Code & Models

Repositories

shuffle-project/asr-comparison (official)
