Learning from Past Mistakes: Improving Automatic Speech Recognition Output via Noisy-Clean Phrase Context Modeling

Prashanth Gurunath Shivakumar; Haoqi Li; Kevin Knight; Panayiotis Georgiou

arXiv:1802.02607·cs.CL·July 1, 2019

TL;DR

This paper introduces a neural network-based error correction system for automatic speech recognition that models ASR as a noisy transformation channel, improving accuracy especially in challenging conditions.

Contribution

The work presents a novel phrase-based error correction approach that leverages long-term context and learns from aggregate module errors to invert common ASR mistakes.

Findings

01 Consistently improves ASR accuracy over baseline systems.

02 Effective in correcting out-of-vocabulary and out-of-domain errors.

03 Enhances performance even on highly optimized ASR systems.

Abstract

Automatic speech recognition (ASR) systems often make unrecoverable errors due to subsystem pruning (acoustic, language and pronunciation models); for example, pruning words due to acoustics using short-term context, prior to rescoring with long-term context based on linguistics. In this work we model ASR as a phrase-based noisy transformation channel and propose an error correction system that can learn from the aggregate errors of all the independent modules constituting the ASR and attempt to invert those. The proposed system can exploit long-term context using a neural network language model and can better choose between existing ASR output possibilities as well as re-introduce previously pruned or unseen (out-of-vocabulary) phrases. It provides corrections under poorly performing ASR conditions without degrading any accurate transcriptions; such corrections are greater on top of…
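
The noisy-channel idea in the abstract can be illustrated with a toy sketch. This is not the authors' system: the phrase table, the probabilities, the lookup dictionary standing in for a neural language model, and the function name `correct` are all hypothetical and chosen only for illustration. The sketch picks the clean sentence that maximizes log P(noisy | clean) + log P(clean) over the phrase substitutions it knows about.

```python
import math

# Hypothetical phrase table: noisy ASR phrase -> {clean phrase: P(noisy | clean)}.
# Every entry and probability here is made up for illustration.
PHRASE_TABLE = {
    "wreck a nice": {"recognize": 0.6, "wreck a nice": 0.4},
    "beach": {"speech": 0.5, "beach": 0.5},
}

# Hypothetical sentence log-probabilities, standing in for a neural language model.
LM_LOGPROB = {
    "recognize speech": math.log(0.05),
    "recognize beach": math.log(0.001),
    "wreck a nice speech": math.log(0.0005),
    "wreck a nice beach": math.log(0.002),
}

# Floor log-probability for sentences the toy language model has never seen.
LM_FLOOR = math.log(1e-9)


def correct(noisy_phrases):
    """Return the clean sentence maximizing log P(noisy | clean) + log P(clean)."""
    hypotheses = [("", 0.0)]  # (partial clean sentence, channel log-probability)
    for phrase in noisy_phrases:
        # Phrases missing from the table pass through unchanged with probability 1.
        options = PHRASE_TABLE.get(phrase, {phrase: 1.0})
        hypotheses = [
            ((prefix + " " + clean).strip(), score + math.log(prob))
            for prefix, score in hypotheses
            for clean, prob in options.items()
        ]
    # Combine the channel score with the language-model score and keep the best.
    best = max(hypotheses, key=lambda h: h[1] + LM_LOGPROB.get(h[0], LM_FLOOR))
    return best[0]


print(correct(["wreck a nice", "beach"]))
```

Exhaustive enumeration is used here only because the toy search space is tiny; a practical decoder would need beam search or a similar pruned search, and the scoring models would be learned from paired noisy and clean transcripts.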

Code & Models

Repositories

cassandra-lehmann/ensemble_methods_ASR_transcripts

Taxonomy

Methods: Pruning