Active Learning for Speech Recognition: the Power of Gradients

Jiaji Huang; Rewon Child; Vinay Rao; Hairong Liu; Sanjeev Satheesh; Adam Coates

arXiv:1612.03226 · cs.CL · December 13, 2016 · 48 citations

Open Access

TL;DR

This paper explores gradient-based active learning for end-to-end speech recognition. It shows that, relative to random sampling, the Expected Gradient Length (EGL) method either reduces word errors by 11% or cuts the number of samples to label by 50%.

Contribution

It justifies EGL for speech recognition from a variance reduction perspective and empirically shows its effectiveness over confidence-based methods.

Findings

01

EGL reduces word errors by 11% compared to random sampling

02

EGL cuts the number of samples to label by 50% compared to random sampling

03

EGL selects novel samples that are uncorrelated with confidence scores

Abstract

In training speech recognition systems, labeling audio clips can be expensive, and not all data is equally valuable. Active learning aims to label only the most informative samples to reduce cost. For speech recognition, confidence scores and other likelihood-based active learning methods have been shown to be effective. Gradient-based active learning methods, however, are still not well understood. This work investigates the Expected Gradient Length (EGL) approach in active learning for end-to-end speech recognition. We justify EGL from a variance reduction perspective, and observe that EGL's measure of informativeness picks novel samples uncorrelated with confidence scores. Experimentally, we show that EGL can reduce word errors by 11%, or alternatively, reduce the number of samples to label by 50%, when compared to random sampling.
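
To make the idea of Expected Gradient Length concrete, here is a minimal sketch on a toy softmax classifier rather than an end-to-end speech model. It is an illustration under stated assumptions, not the paper's implementation. For each unlabeled sample it averages the norm of the loss gradient over candidate labels, weighted by the model's own predicted probabilities, and then picks the highest-scoring samples for labeling. The names `egl_scores` and `select_for_labeling`, the weight matrix `W`, and the `budget` parameter are hypothetical.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def egl_scores(W, X):
    """Expected gradient length per sample for softmax regression.

    Assumption: the model is p(y|x) = softmax(x @ W) with W of shape (d, k),
    trained with negative log-likelihood. For a candidate label y the gradient
    with respect to W is outer(x, p - e_y), whose Frobenius norm is
    ||x|| * ||p - e_y||.
    """
    P = softmax(X @ W)                            # (n, k) predicted probabilities
    k = P.shape[1]
    diff = P[:, None, :] - np.eye(k)[None, :, :]  # diff[i, y] = p_i - e_y
    resid_norm = np.linalg.norm(diff, axis=2)     # (n, k)
    x_norm = np.linalg.norm(X, axis=1)            # (n,)
    # Expectation over labels under the model's own predictive distribution.
    return x_norm * (P * resid_norm).sum(axis=1)

def select_for_labeling(W, X, budget):
    # Indices of the `budget` samples with the largest expected gradient length.
    return np.argsort(-egl_scores(W, X))[:budget]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_pool = rng.normal(size=(8, 3))   # hypothetical unlabeled pool
    W = rng.normal(size=(3, 4))        # hypothetical current model weights
    print(select_for_labeling(W, X_pool, budget=3))
```

In this toy setting a sample the model classifies with near certainty gets a score close to zero, since the only label with non-negligible probability yields an almost vanishing gradient. A sequence model cannot enumerate all labelings this way, so the expectation would have to be approximated over a limited set of candidate transcriptions.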

Taxonomy

Topics: Machine Learning and Algorithms · Speech Recognition and Synthesis · Algorithms and Data Compression