SlimIPL: Language-Model-Free Iterative Pseudo-Labeling
Tatiana Likhomanenko, Qiantong Xu, Jacob Kahn, Gabriel Synnaeve, Ronan Collobert

TL;DR
SlimIPL is a language-model-free iterative pseudo-labeling method for speech recognition. It re-generates hard pseudo-labels without a language model, reaching high accuracy with less labeled data and compute, and is especially effective in low-resource settings.
Contribution
The paper proposes slimIPL, a novel iterative pseudo-labeling approach that eliminates the need for language models, improves training stability, and reduces computational costs in semi-supervised speech recognition.
Findings
Achieves state-of-the-art results with 100 hours of labeled data without language models.
Requires 3.5-4x fewer resources to converge compared to other methods.
Performs competitively with self-supervised approaches using only 10 hours of labeled data.
Abstract
Recent results in end-to-end automatic speech recognition have demonstrated the efficacy of pseudo-labeling for semi-supervised models trained both with Connectionist Temporal Classification (CTC) and Sequence-to-Sequence (seq2seq) losses. Iterative Pseudo-Labeling (IPL), which continuously trains a single model using pseudo-labels iteratively re-generated as the model learns, has been shown to further improve performance in ASR. We improve upon the IPL algorithm: as the model learns, we propose to iteratively re-generate transcriptions with hard labels (the most probable tokens), that is, without a language model. We call this approach Language-Model-Free IPL (slimIPL) and give a resultant training setup for low-resource settings with CTC-based models. slimIPL features a dynamic cache for pseudo-labels which reduces sensitivity to changes in relabeling hyperparameters and results in…
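A minimal sketch of what "hard labels (the most probable tokens)" can look like for a CTC model, assuming standard greedy CTC decoding: take the most probable token in each frame, collapse consecutive repeats, and drop blanks. The function name `hard_labels`, the blank index 0, and the toy scores are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence


def hard_labels(frame_scores: Sequence[Sequence[float]], blank: int = 0) -> List[int]:
    """Greedy CTC decoding (assumed here as the source of hard pseudo-labels).

    frame_scores: one row of per-token scores (e.g. log-probabilities) per frame.
    blank: index of the CTC blank token (0 is an assumption of this sketch).
    """
    tokens: List[int] = []
    prev = blank
    for scores in frame_scores:
        # Most probable token for this frame; no language model is consulted.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the token changes and is not the blank.
        if best != prev and best != blank:
            tokens.append(best)
        prev = best
    return tokens


# Toy example: 3-token vocabulary (0 = blank), 7 frames.
toy_scores = [
    [0.1, 0.8, 0.1],  # token 1
    [0.1, 0.8, 0.1],  # token 1 (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],  # token 1 (new emission after blank)
    [0.1, 0.1, 0.8],  # token 2
    [0.1, 0.1, 0.8],  # token 2 (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
]
pseudo_label = hard_labels(toy_scores)
```

In an iterative pseudo-labeling loop, a transcription produced this way would serve as the training target for the corresponding unlabeled utterance until it is re-generated by a later state of the model.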
Taxonomy
Methods: Iterative Pseudo-Labeling · Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence
