Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction
Yusuf Brima, Ulf Krumnack, Simone Pika, Gunther Heidemann

TL;DR
This paper empirically analyzes Barlow Twins, a self-supervised learning method based on invariance and redundancy reduction, applied to speech. It highlights benefits in transferability and sample efficiency and discusses the method's limitations in disentangling learned representations.
Contribution
It provides an empirical evaluation of Barlow Twins for speech, revealing its strengths and limitations, and suggests directions for incorporating additional priors to improve hierarchical representations.
Findings
Barlow Twins accelerates learning in downstream speech tasks.
Representations transfer effectively across domains.
Redundancy reduction and invariance alone are insufficient for full factorization of the learned latents.
Abstract
Self-supervised learning (SSL) has emerged as a promising paradigm for learning flexible speech representations from unlabeled data. By designing pretext tasks that exploit statistical regularities, SSL models can capture useful representations that are transferable to downstream tasks. This study provides an empirical analysis of Barlow Twins (BT), an SSL technique inspired by theories of redundancy reduction in human perception. On downstream tasks, BT representations accelerated learning and transferred across domains. However, limitations exist in disentangling key explanatory factors, with redundancy reduction and invariance alone insufficient for factorization of learned latents into modular, compact, and informative codes. Our ablation study isolated gains from invariance constraints, but the gains were context-dependent. Overall, this work substantiates the potential of Barlow…
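For readers unfamiliar with the two terms the abstract refers to, the following is a minimal NumPy sketch of the standard Barlow Twins objective, not the authors' implementation. It assumes two batches of embeddings, `z_a` and `z_b`, computed from two augmented views of the same audio clips; the function name, the `eps` stabilizer, and the default weight `lam=5e-3` are illustrative assumptions rather than values taken from this paper.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lam=5e-3, eps=1e-9):
    """Sketch of the Barlow Twins objective (assumed form, not the paper's code).

    z_a, z_b: arrays of shape (batch, dim), embeddings of two views.
    lam:      weight on the redundancy-reduction term (illustrative default).
    Returns (total, invariance, redundancy).
    """
    n = z_a.shape[0]
    # Standardize every embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: pushes diagonal entries toward 1,
    # so matching dimensions agree across views.
    invariance = np.sum((1.0 - diag) ** 2)
    # Redundancy-reduction term: pushes off-diagonal entries toward 0,
    # so different dimensions are decorrelated.
    redundancy = np.sum(c ** 2) - np.sum(diag ** 2)
    return invariance + lam * redundancy, invariance, redundancy

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(256, 8))
    noisy = z + 0.1 * rng.normal(size=z.shape)  # stand-in for a second view
    total, inv, red = barlow_twins_loss(z, noisy)
    print(f"total={total:.4f} invariance={inv:.4f} redundancy={red:.4f}")
```

The redundancy term penalizes only pairwise linear correlation between embedding dimensions. Feeding the function an embedding whose two columns are identical drives the off-diagonal correlations to about 1, which the term penalizes, but driving that term to zero does not by itself guarantee that each dimension encodes a separate explanatory factor, which is consistent with the limitation the abstract reports.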
Taxonomy
Topics: Speech Recognition and Synthesis
Methods: Barlow Twins
