Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks

Alexandros Kastanos; Anton Ragni; Mark Gales

arXiv:1910.11933 · eess.AS · March 17, 2020

TL;DR

This paper extends lattice recurrent neural networks to incorporate sub-word information for confidence estimation in black box speech recognition systems, where typically only word sequences are available, with the aim of improving confidence scores.

Contribution

It extends the lattice RNN to handle sub-word information alongside word-level information, improving confidence scoring for black box ASR systems that offer limited means for quality control.

Findings

01. Significant improvement in confidence estimation accuracy

02. Effective use of sub-word information in lattice RNNs

03. Validated on IARPA OpenKWS 2016 dataset

Abstract

Recently, there has been growth in providers of speech transcription services enabling others to leverage technology they would not normally be able to use. As a result, speech-enabled solutions have become commonplace. Their success critically relies on the quality, accuracy, and reliability of the underlying speech transcription systems. Those black box systems, however, offer limited means for quality control as only word sequences are typically available. This paper examines this limited resource scenario for confidence estimation, a measure commonly used to assess transcription reliability. In particular, it explores what other sources of word and sub-word level information available in the transcription process could be used to improve confidence scores. To encode all such information this paper extends lattice recurrent neural networks to handle sub-words. Experimental results…
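
As a rough illustration of the idea of combining word and sub-word level information into a confidence score, the sketch below mean-pools per-sub-word feature vectors into a fixed-size vector, concatenates it with word-level features, and maps the result to a value in (0, 1) with a logistic unit. This is not the lattice recurrent neural network described in the paper: the function names, the feature choices, and the weights are all hypothetical and untrained, chosen only to show the data flow.

```python
import math
from typing import List, Sequence


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a non-empty list of equal-length feature vectors."""
    if not vectors:
        raise ValueError("need at least one sub-word feature vector")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def word_confidence(
    word_feats: Sequence[float],
    subword_feats: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Hypothetical confidence scorer (an assumption, not the paper's model).

    Concatenates word-level features with mean-pooled sub-word features
    and applies a logistic unit, so the output lies strictly in (0, 1).
    """
    pooled = mean_pool(subword_feats)
    x = list(word_feats) + pooled
    if len(x) != len(weights):
        raise ValueError("weights must match the combined feature length")
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Made-up inputs: two word-level features (e.g. a posterior and a duration)
# and one feature per sub-word unit (e.g. a per-grapheme duration).
word_feats = [0.9, 0.3]
subword_feats = [[0.1], [0.2], [0.3]]
weights = [2.0, 0.0, 0.0]  # illustrative values only, not trained
bias = -1.0

conf = word_confidence(word_feats, subword_feats, weights, bias)
print(f"confidence = {conf:.3f}")
```

Mean pooling is used here only because it turns a variable number of sub-word units into a fixed-size input; a learned encoder over the sub-word sequence would serve the same role in a trained system.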
