Toward Zero Oracle Word Error Rate on the Switchboard Benchmark
Arlo Faria, Adam Janin, Korbinian Riedhammer, Sidhi Adkoli

TL;DR
This paper revisits how speech recognition systems are evaluated on the Switchboard benchmark. With corrected reference transcriptions and a revised scoring method, a research system surpasses human performance, and oracle WER approaches zero.
Contribution
It introduces a more detailed and reproducible evaluation scheme that lowers measured WER, an alternative transcript-precision metric that better discriminates between human and machine performance, and methods for computing oracle WER.
Findings
Commercial ASR systems can score below 5% WER under the corrected references and revised scoring.
A research system lowers the established record to 2.3% WER and surpasses the accuracy of commercial human speech recognition.
Oracle WER can be reduced to 0.18% using dense lattices and alternative data structures.
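To illustrate what an oracle WER measures, here is a minimal sketch that assumes each utterance comes with a plain list of alternative hypotheses (an utterance-level N-best list). The function names `edit_distance` and `oracle_wer` and the toy sentences are hypothetical. This is not the paper's scoring setup, which relies on dense lattices and other data structures.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def oracle_wer(references, nbest_lists):
    """Corpus-level oracle WER: per utterance, keep the alternative with the fewest errors."""
    total_errors = 0
    total_ref_words = 0
    for ref, alternatives in zip(references, nbest_lists):
        ref_words = ref.split()
        total_errors += min(edit_distance(ref_words, alt.split())
                            for alt in alternatives)
        total_ref_words += len(ref_words)
    return total_errors / total_ref_words


# Toy data (made up for illustration only).
references = ["i think so", "that is right"]
nbest_lists = [
    ["i thing so", "i think so"],       # second alternative is exact
    ["that is write", "that's right"],  # best alternative still has one error
]
print(f"oracle WER = {oracle_wer(references, nbest_lists):.3f}")
```

The oracle picks the best alternative per utterance using knowledge of the reference, so it is a lower bound on what any selection strategy could reach from those alternatives, not an achievable system score.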
Abstract
The "Switchboard benchmark" is a very well-known test set in automatic speech recognition (ASR) research, establishing record-setting performance for systems that claim human-level transcription accuracy. This work highlights lesser-known practical considerations of this evaluation, demonstrating major improvements in word error rate (WER) by correcting the reference transcriptions and deviating from the official scoring methodology. In this more detailed and reproducible scheme, even commercial ASR systems can score below 5% WER and the established record for a research system is lowered to 2.3%. An alternative metric of transcript precision is proposed, which does not penalize deletions and appears to be more discriminating for human vs. machine performance. While commercial ASR systems are still below this threshold, a research system is shown to clearly surpass the accuracy of…
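The contrast between WER and a deletion-tolerant precision metric can be sketched as follows. The abstract does not give the exact formula for transcript precision, so this sketch assumes precision = correct / (correct + substitutions + insertions), that is, the fraction of hypothesis words that are correct. The function names and example sentences are hypothetical, and the simple alignment stands in for the official scoring tool.

```python
def align_counts(ref, hyp):
    """Return (correct, substitutions, deletions, insertions) for a minimum-edit alignment."""
    # Each cell holds (errors, correct, subs, dels, ins).
    prev = [(j, 0, 0, 0, j) for j in range(len(hyp) + 1)]
    for i, r in enumerate(ref, start=1):
        curr = [(i, 0, 0, i, 0)]
        for j, h in enumerate(hyp, start=1):
            e, c, s, d, n = prev[j - 1]
            diag = (e, c + 1, s, d, n) if r == h else (e + 1, c, s + 1, d, n)
            e, c, s, d, n = prev[j]
            dele = (e + 1, c, s, d + 1, n)   # reference word missing from hypothesis
            e, c, s, d, n = curr[j - 1]
            ins = (e + 1, c, s, d, n + 1)    # extra word in hypothesis
            curr.append(min(diag, dele, ins, key=lambda t: t[0]))
        prev = curr
    _, c, s, d, n = prev[-1]
    return c, s, d, n


def wer(c, s, d, n):
    """Standard WER: all errors divided by the number of reference words."""
    return (s + d + n) / (c + s + d)


def precision(c, s, d, n):
    """Assumed definition: correct words over hypothesis words; deletions are not counted."""
    return c / (c + s + n)


ref = "yeah i think that is right".split()
hyp = "i think that is write".split()
counts = align_counts(ref, hyp)
print(counts, round(wer(*counts), 3), round(precision(*counts), 3))
```

Under this assumed definition, a transcript that omits words but gets every emitted word right keeps a precision of 1.0 while its WER rises, which is the behavior the abstract describes as not penalizing deletions.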
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Network Packet Processing and Optimization
Methods: Test
