Sampling strategies in Siamese Networks for unsupervised speech representation learning

Rachid Riad; Corentin Dancette; Julien Karadayi; Neil Zeghidour; Thomas Schatz; Emmanuel Dupoux

arXiv:1804.11297·cs.CL·August 24, 2018

TL;DR

This paper systematically examines how sampling strategies that account for the distributions of word frequencies and speakers affect the performance of Siamese networks in unsupervised speech representation learning, and shows that well-chosen strategies improve results.

Contribution

The paper highlights the importance of the sampling procedure in Siamese networks and shows that strategies accounting for Zipf's Law and the speaker distribution significantly affect learning.

Findings

01. Sampling strategies that take Zipf's Law into account improve performance.

02. Word frequency compression benefits learning across a wide range of training set sizes.

03. Applying these strategies to word pairs discovered without supervision improves on state-of-the-art results.

Abstract

Recent studies have investigated siamese network architectures for learning invariant speech representations using same-different side information at the word level. Here we investigate systematically an often ignored component of siamese networks: the sampling procedure (how pairs of same vs. different tokens are selected). We show that sampling strategies taking into account Zipf's Law, the distribution of speakers and the proportions of same and different pairs of words significantly impact the performance of the network. In particular, we show that word frequency compression improves learning across a large range of variations in number of training pairs. This effect does not apply to the same extent to the fully unsupervised setting, where the pairs of same-different words are obtained by spoken term discovery. We apply these results to pairs of words discovered using an…
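
To make the idea of word frequency compression concrete, here is a minimal Python sketch of a pair sampler. It is an illustration under stated assumptions, not the authors' implementation: the power-law exponent `alpha`, the `p_same` proportion, and the function names `build_index`, `compressed_weights`, and `sample_pairs` are all hypothetical. With `alpha = 1` word types are drawn in proportion to their raw (Zipfian) counts, and with `alpha = 0` every word type is equally likely; values in between compress the frequency distribution. Speaker information is carried in the input but not used here.

```python
import random
from collections import defaultdict


def build_index(tokens):
    """Group token ids by word type.

    tokens: iterable of (token_id, word, speaker) tuples (hypothetical format).
    The speaker field is ignored in this sketch.
    """
    by_word = defaultdict(list)
    for token_id, word, _speaker in tokens:
        by_word[word].append(token_id)
    return by_word


def compressed_weights(by_word, alpha, min_tokens):
    """Return word types with at least min_tokens tokens and their weights.

    The weight of a word type is count ** alpha (an assumed compression
    function): alpha=1 keeps raw counts, alpha=0 gives uniform weights.
    """
    words = sorted(w for w, toks in by_word.items() if len(toks) >= min_tokens)
    weights = [len(by_word[w]) ** alpha for w in words]
    return words, weights


def sample_pairs(by_word, n_pairs, alpha=0.5, p_same=0.5, seed=0):
    """Sample (token_a, token_b, label) triples; label 1 = same word, 0 = different."""
    rng = random.Random(seed)
    # "Same" pairs need a word type with at least two tokens.
    same_words, same_weights = compressed_weights(by_word, alpha, 2)
    all_words, all_weights = compressed_weights(by_word, alpha, 1)
    if not same_words or len(all_words) < 2:
        raise ValueError("need a repeated word type and at least two word types")

    pairs = []
    for _ in range(n_pairs):
        if rng.random() < p_same:
            word = rng.choices(same_words, weights=same_weights, k=1)[0]
            a, b = rng.sample(by_word[word], 2)
            pairs.append((a, b, 1))
        else:
            w1 = rng.choices(all_words, weights=all_weights, k=1)[0]
            w2 = w1
            while w2 == w1:  # redraw until the two word types differ
                w2 = rng.choices(all_words, weights=all_weights, k=1)[0]
            pairs.append((rng.choice(by_word[w1]), rng.choice(by_word[w2]), 0))
    return pairs
```

Sweeping `alpha` between 0 and 1 and `p_same` between 0 and 1 in such a sampler is one simple way to explore the kind of trade-offs the abstract describes.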

Taxonomy

Methods: Siamese Network