ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging

Fangyuan Wang; Ming Hao; Yuhai Shi; Bo Xu

arXiv:2308.02870·cs.CL·August 8, 2023

TL;DR

This paper introduces ApproBiVT, a new approach for ASR model training that uses an approximated bias-variance tradeoff to guide early stopping and checkpoint averaging, leading to improved generalization and lower error rates.

Contribution

The paper takes training loss and validation loss as proxies for bias and variance, and uses their tradeoff to decide when to stop training and which checkpoints to average.

Findings

01. Achieves a 2.5%-3.7% CER reduction on AISHELL-1

02. Achieves a 3.1%-4.6% CER reduction on AISHELL-2

03. Guided training improves the generalization of ASR models

Abstract

The conventional recipe for Automatic Speech Recognition (ASR) models is to 1) train multiple checkpoints on a training set while relying on a validation set to prevent overfitting via early stopping, and 2) average either the last several checkpoints or those with the lowest validation losses to obtain the final model. In this paper, we rethink and update early stopping and checkpoint averaging from the perspective of the bias-variance tradeoff. Theoretically, bias and variance represent the fitness and variability of a model, and the tradeoff between them determines the overall generalization error. However, it is impractical to evaluate them precisely. As an alternative, we take the training loss and validation loss as proxies for bias and variance and use their tradeoff, which we call the Approximated Bias-Variance Tradeoff (ApproBiVT), to guide early stopping and checkpoint averaging. When evaluating…
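The idea described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the text above does not confirm: the tradeoff score is taken to be the plain sum of training loss and validation loss, early stopping uses a simple patience rule, and the function names (`tradeoff_scores`, `early_stop_epoch`, `select_checkpoints`, `average_checkpoints`) and the loss values are hypothetical. It is not the authors' implementation.

```python
from typing import Dict, List, Sequence


def tradeoff_scores(train_losses: Sequence[float],
                    val_losses: Sequence[float]) -> List[float]:
    """Per-epoch tradeoff score.

    ASSUMPTION: the score is train loss + validation loss. The abstract only
    says the two losses act as proxies for bias and variance.
    """
    if len(train_losses) != len(val_losses):
        raise ValueError("loss histories must have the same length")
    return [t + v for t, v in zip(train_losses, val_losses)]


def early_stop_epoch(scores: Sequence[float], patience: int = 2) -> int:
    """Return the epoch index at which training stops.

    ASSUMPTION: stop once the score has failed to improve for `patience`
    consecutive epochs; otherwise run to the last epoch.
    """
    best = float("inf")
    stale = 0
    for epoch, score in enumerate(scores):
        if score < best:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(scores) - 1


def select_checkpoints(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k checkpoints with the lowest tradeoff score."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(ranked[:k])


def average_checkpoints(
    checkpoints: Sequence[Dict[str, List[float]]]
) -> Dict[str, List[float]]:
    """Element-wise mean of parameters across checkpoints.

    Each checkpoint is a toy mapping from parameter name to a flat list of
    floats; real checkpoints would hold tensors.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    n = len(checkpoints)
    averaged = {}
    for name in checkpoints[0]:
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(col) / n for col in columns]
    return averaged


# Hypothetical loss histories: training loss keeps falling while
# validation loss bottoms out and rises again.
train = [4.0, 3.0, 2.0, 1.5, 1.0, 0.5]
val = [4.0, 3.0, 2.5, 2.5, 3.25, 5.0]
scores = tradeoff_scores(train, val)
stop = early_stop_epoch(scores, patience=2)
chosen = select_checkpoints(scores[: stop + 1], k=2)
```

In this sketch, selection is driven by the combined score rather than by validation loss alone, which is the contrast with the conventional recipe that the abstract draws. How the two losses are actually combined and how many checkpoints are averaged are details to take from the paper itself.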

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and Audio Processing

Methods: Early Stopping