Continuous Pseudo-Labeling from the Start
Dan Berrebbi, Ronan Collobert, Samy Bengio, Navdeep Jaitly, Tatiana Likhomanenko

TL;DR
This paper introduces a novel approach to self-training in automatic speech recognition that generates pseudo-labels from the very start of training, reducing overfitting and improving generalization.
Contribution
It demonstrates the feasibility of continuous pseudo-labeling from the beginning of training using dynamic curriculum control and sampling techniques, a first in ASR research.
Findings
Achieves results comparable to prior methods without an external language model.
Controlled pseudo-label evolution improves model stability and generalization.
Sampling from the predictive distribution stabilizes training.
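The last finding contrasts two ways of turning model outputs into pseudo-labels: taking the single most likely token, or drawing a token from the predictive distribution. The sketch below illustrates that difference only. It assumes per-frame logits over a small vocabulary and a softmax temperature; the function names (`softmax`, `greedy_labels`, `sampled_labels`) and the `temperature` parameter are hypothetical, chosen for illustration, and are not taken from the paper.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert one frame's logits to probabilities (temperature is an assumed knob)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_labels(frame_logits):
    """Pick the highest-scoring token index for each frame."""
    return [max(range(len(f)), key=f.__getitem__) for f in frame_logits]


def sampled_labels(frame_logits, temperature=1.0, rng=None):
    """Draw one token index per frame from softmax(logits / temperature)."""
    rng = rng or random.Random()
    labels = []
    for f in frame_logits:
        probs = softmax(f, temperature)
        labels.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return labels


if __name__ == "__main__":
    # Hypothetical logits: 4 frames, 3-token vocabulary.
    frames = [[2.0, 1.0, 0.1], [0.2, 0.1, 3.0], [1.0, 1.0, 1.0], [0.0, 4.0, 0.5]]
    print("greedy :", greedy_labels(frames))
    print("sampled:", sampled_labels(frames, temperature=1.0, rng=random.Random(0)))
```

When the model is confident, sampled labels mostly agree with the greedy ones; when it is uncertain, sampling yields varied targets rather than repeating one possibly wrong guess, which is one plausible reading of why sampling could stabilize training.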
Abstract
Self-training (ST), or pseudo-labeling, has sparked significant interest in the automatic speech recognition (ASR) community recently because of its success in harnessing unlabeled data. Unlike prior semi-supervised learning approaches that relied on iteratively regenerating pseudo-labels (PLs) from a trained model and using them to train a new model, recent state-of-the-art methods perform "continuous training", where PLs are generated using a very recent version of the model being trained. Nevertheless, these approaches still rely on bootstrapping the ST using an initial supervised learning phase where the model is trained on labeled data alone. We believe this has the potential for over-fitting to the labeled dataset in low-resource settings and that ST from the start of training should reduce over-fitting. In this paper, we show how we can do this by dynamically controlling the…
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Speech and dialogue systems
