ASR Performance Prediction on Unseen Broadcast Programs using Convolutional Neural Networks

Zied Elloumi; Laurent Besacier; Olivier Galibert; Juliette Kahn; Benjamin Lecouteux

arXiv:1804.08477·cs.CL·April 24, 2018

TL;DR

This paper introduces a CNN-based approach for predicting ASR performance on unseen broadcast programs, combining textual and signal features to improve WER prediction accuracy on a new French corpus.

Contribution

It presents a novel CNN-based method that effectively combines textual and signal inputs for ASR performance prediction, outperforming traditional regression models.

Findings

01

CNN approach outperforms regression baseline in WER prediction

02

Combining textual and signal features improves CNN performance

03

CNN accurately predicts WER distribution across speech recordings

Abstract

In this paper, we address a relatively new task: prediction of ASR performance on unseen broadcast programs. We first propose a heterogeneous French corpus dedicated to this task. Two prediction approaches are compared: a state-of-the-art performance prediction method based on regression (engineered features) and a new strategy based on convolutional neural networks (learnt features). We particularly focus on the combination of textual (ASR transcription) and signal inputs. While the joint use of textual and signal features did not work for the regression baseline, the combination of inputs for CNNs leads to the best WER prediction performance. We also show that our CNN predicts the WER distribution on a collection of speech recordings remarkably well.
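
For readers unfamiliar with the metric being predicted, the sketch below computes word error rate (WER) from a reference transcript and an ASR hypothesis, using the standard definition: word-level substitutions, deletions, and insertions divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions; the scoring tool and text normalization used in the paper are not described in the abstract and may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly,
    with no case folding or other normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.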
