Guided contrastive self-supervised pre-training for automatic speech recognition
Aparna Khare, Minhua Wu, Saurabhchand Bhati, Jasha Droppo, Roland Maas

TL;DR
This paper introduces Guided Contrastive Predictive Coding (GCPC), a novel pre-training method for automatic speech recognition that incorporates prior knowledge to improve model performance across multiple languages.
Contribution
The paper proposes GCPC, a new contrastive learning approach that injects prior knowledge during pre-training to enhance speech recognition accuracy.
Findings
GCPC outperforms CPC pre-training on all three datasets.
Reduces Word Error Rate (WER) by up to 15.43% relative, compared to training from scratch.
Demonstrates effectiveness across German, French, and English ASR tasks.
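The WER reductions reported here are relative, not absolute. As a minimal sketch of how such a figure is computed, the helper below (a hypothetical name, not taken from the paper) expresses the improvement as a percentage of the baseline WER. The example WER values are made up for illustration and are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    baseline_wer: WER of the reference system (e.g. trained from scratch).
    new_wer: WER of the system being compared (e.g. with pre-training).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers, for illustration only: a drop from 10.0% to 9.0% WER
# is 1.0 point absolute, but 10% relative.
example_reduction = relative_wer_reduction(10.0, 9.0)
print(f"{example_reduction:.2f}% relative")
```

Relative figures make results comparable across tasks whose baseline WERs differ, which is presumably why the paper reports them per language.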
Abstract
Contrastive Predictive Coding (CPC) is a representation learning method that maximizes the mutual information between intermediate latent representations and the output of a given model. It can be used to effectively initialize the encoder of an Automatic Speech Recognition (ASR) model. We present a novel modification of CPC called Guided Contrastive Predictive Coding (GCPC). Our proposed method maximizes the mutual information between representations from a prior-knowledge model and the output of the model being pre-trained, allowing prior knowledge injection during pre-training. We validate our method on 3 ASR tasks: German, French and English. Our method outperforms CPC pre-training on all three datasets, reducing the Word Error Rate (WER) by 4.44%, 6.55% and 15.43% relative on the German, French and English (Librispeech) tasks respectively, compared to training from scratch, while…
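The contrastive objective described in the abstract can be illustrated with an InfoNCE-style loss. The sketch below is not the authors' implementation: the dot-product scoring, the use of other frames in the batch as negatives, the array shapes, and all names are assumptions made for illustration. The only GCPC-specific idea it tries to show is that the targets come from a prior-knowledge model rather than from the latents of the model being pre-trained.

```python
import numpy as np


def info_nce_loss(predictions: np.ndarray, targets: np.ndarray) -> float:
    """InfoNCE loss with in-batch negatives (illustrative assumption).

    predictions: (T, D) outputs of the model being pre-trained.
    targets:     (T, D) target representations; row t is the positive for
                 prediction t, all other rows act as negatives.
    """
    logits = predictions @ targets.T                      # (T, T) similarity scores
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))            # positives lie on the diagonal


rng = np.random.default_rng(0)
T, D = 8, 16

# Stand-in for representations from a prior-knowledge model (hypothetical data),
# normalized to unit length.
prior_targets = rng.normal(size=(T, D))
prior_targets /= np.linalg.norm(prior_targets, axis=1, keepdims=True)

# Predictions that carry no information give uniform scores: loss equals log(T).
uninformative_loss = info_nce_loss(np.zeros((T, D)), prior_targets)

# Predictions aligned with the prior-knowledge targets give a lower loss.
aligned_loss = info_nce_loss(20.0 * prior_targets, prior_targets)

print(uninformative_loss, aligned_loss)
```

Minimizing this loss pushes each prediction toward its matching target and away from the others. This corresponds to maximizing a lower bound on the mutual information between the two sets of representations.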
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing
Methods: InfoNCE · Contrastive Predictive Coding
