Regularizing Contrastive Predictive Coding for Speech Applications
Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velazquez, Najim Dehak

TL;DR
This paper introduces regularization techniques for Contrastive Predictive Coding in speech applications, improving unsupervised speech representations and reducing labeled data requirements.
Contribution
The paper proposes two novel regularization methods, Self-expressing constraint and Left-or-Right regularization, to enhance CPC for speech tasks.
Findings
Regularized CPC matches baseline performance with less data.
Regularization techniques improve ABX and phoneme classification results.
Methods are effective across monolingual, cross-lingual, and multilingual settings.
Abstract
Self-supervised methods such as Contrastive Predictive Coding (CPC) have greatly improved the quality of unsupervised representations. These representations significantly reduce the amount of labeled data needed for downstream tasks such as automatic speech recognition. CPC learns representations by learning to predict future frames given current frames. Based on the observation that acoustic information, e.g., phones, changes more slowly than the feature extraction rate in CPC, we propose two regularization techniques that impose slowness constraints on the features: the Self-expressing constraint and Left-or-Right regularization. We evaluate the proposed model on ABX and linear phone classification tasks, acoustic unit discovery, and automatic speech recognition. The regularized CPC trained on 100 hours of unlabeled data matches…
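As a rough illustration of the contrastive objective that CPC relies on (InfoNCE, listed under Methods), the sketch below scores each predicted future-frame embedding against a set of candidate embeddings and penalizes the model when the true future frame does not score highest. It is a minimal sketch and not the authors' implementation: the function name `info_nce_loss`, the dot-product scoring, and the use of the other items in the batch as negatives are all assumptions made for illustration. The paper's two regularizers are not shown because this summary does not describe their form.

```python
import math


def info_nce_loss(predictions, targets):
    """Mean InfoNCE loss over a batch (illustrative sketch only).

    predictions[i]: predicted future-frame embedding for item i.
    targets[i]:     true future-frame embedding for item i (the positive).
    Assumption: every other targets[j], j != i, is used as a negative.
    """
    n = len(predictions)
    total = 0.0
    for i in range(n):
        # Dot-product score between prediction i and every candidate target.
        scores = [
            sum(p * t for p, t in zip(predictions[i], targets[j]))
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidates.
        m = max(scores)
        log_denom = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-probability assigned to the positive candidate.
        total += log_denom - scores[i]
    return total / n


if __name__ == "__main__":
    targets = [[5.0, 0.0], [0.0, 5.0]]
    aligned = [[1.0, 0.0], [0.0, 1.0]]     # predictions point at their own targets
    misaligned = [[0.0, 1.0], [1.0, 0.0]]  # predictions point at the wrong targets
    print(info_nce_loss(aligned, targets) < info_nce_loss(misaligned, targets))
```

When the predictions cannot distinguish the candidates at all (every score equal), the loss equals the logarithm of the number of candidates, which is a convenient sanity check for an implementation like this one.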
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
Methods: InfoNCE · Contrastive Predictive Coding
