Bilevel Joint Unsupervised and Supervised Training for Automatic Speech Recognition
Xiaodong Cui, A F M Saif, Songtao Lu, Lisha Chen, Tianyi Chen, Brian Kingsbury, George Saon

TL;DR
This paper introduces BL-JUST, a bilevel training framework that jointly optimizes unsupervised and supervised objectives for speech recognition, yielding better acoustic models than the conventional two-stage pre-training and fine-tuning strategy.
Contribution
The paper presents a novel bilevel joint training approach for speech recognition that simultaneously optimizes unsupervised and supervised losses, improving model performance.
Findings
BL-JUST can outperform the widely used pre-training and fine-tuning strategy.
It also outperforms some other popular semi-supervised techniques.
The method balances generic and task-specific acoustic representations.
Abstract
In this paper, we propose a bilevel joint unsupervised and supervised training (BL-JUST) framework for automatic speech recognition. Compared to the conventional pre-training and fine-tuning strategy, which is a disconnected two-stage process, BL-JUST tries to optimize an acoustic model such that it simultaneously minimizes both the unsupervised and supervised loss functions. Because BL-JUST seeks matched local optima of both loss functions, acoustic representations learned by the acoustic model strike a good balance between being generic and task-specific. We solve the BL-JUST problem using penalty-based bilevel gradient descent and evaluate the trained deep neural network acoustic models on various datasets with a variety of architectures and loss functions. We show that BL-JUST can outperform the widely used pre-training and fine-tuning strategy and some other popular semi-supervised…
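To make the idea of penalty-based bilevel gradient descent concrete, here is a minimal sketch on a two-parameter toy problem. It is not the authors' implementation: the quadratic losses, the names `sup_grad`, `unsup_grad`, and `penalty_bilevel_gd`, the linear ramp of the penalty weight `gamma`, and the step size are all assumptions chosen for illustration. The toy "unsupervised" loss has a whole line of minimizers, and the "supervised" loss picks among them, which is one way to picture seeking matched optima of both losses.

```python
import numpy as np

# Hypothetical toy losses (assumptions, not taken from the paper):
#   supervised (upper level):   f(theta) = 0.5 * ||theta - TARGET||^2
#   unsupervised (lower level): g(theta) = 0.5 * (A . theta - 2)^2
# g is minimized (with value 0) on the whole line theta[0] + theta[1] = 2.
TARGET = np.array([2.0, 2.0])
A = np.array([1.0, 1.0])


def sup_grad(theta):
    """Gradient of the toy supervised loss f."""
    return theta - TARGET


def unsup_grad(theta):
    """Gradient of the toy unsupervised loss g."""
    return A * (A @ theta - 2.0)


def penalty_bilevel_gd(theta0, gamma_max=50.0, ramp_steps=1000, total_steps=3000):
    """Gradient descent on the penalized objective f(theta) + gamma * g(theta).

    gamma is ramped linearly from 0 to gamma_max (an assumed schedule), so the
    lower-level constraint "theta minimizes g" is enforced more strictly over time.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    # Step size chosen so the update is stable for the largest curvature,
    # which is 1 + 2 * gamma_max for these quadratics.
    lr = 1.0 / (1.0 + 2.0 * gamma_max)
    for k in range(total_steps):
        gamma = gamma_max * min(1.0, k / ramp_steps)
        theta -= lr * (sup_grad(theta) + gamma * unsup_grad(theta))
    return theta


theta_hat = penalty_bilevel_gd([3.0, -1.0])
print(theta_hat)  # close to (1, 1), the point on the line nearest TARGET
```

In this toy setting the exact bilevel solution is (1, 1), and the penalized iterate approaches it as `gamma_max` grows. In a real acoustic model the two gradients would come from mini-batches of unlabeled and labeled speech, and the schedule for `gamma` would need tuning.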
Taxonomy
Topics: Speech Recognition and Synthesis
