Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation

Christiaan Jacobs; Yevgen Matusevych; Herman Kamper

arXiv:2103.10731·cs.CL·March 22, 2021

TL;DR

This paper explores using contrastive self-supervised learning and multilingual transfer to create effective acoustic word embeddings for zero-resource languages, improving word discrimination performance.

Contribution

It uses terms from an unsupervised term discovery system as training pairs for a contrastive loss, both to train purely unsupervised monolingual models and to adapt multilingual models to a specific zero-resource language, outperforming previous methods in zero-resource settings.

Findings

01

Contrastive self-supervision improves monolingual AWEs.

02

Multilingual AWE adaptation with contrastive learning outperforms previous models.

03

Best results achieved in word discrimination across six zero-resource languages.

Abstract

Acoustic word embeddings (AWEs) are fixed-dimensional representations of variable-length speech segments. For zero-resource languages where labelled data is not available, one AWE approach is to use unsupervised autoencoder-based recurrent models. Another recent approach is to use multilingual transfer: a supervised AWE model is trained on several well-resourced languages and then applied to an unseen zero-resource language. We consider how a recent contrastive learning loss can be used in both the purely unsupervised and multilingual transfer settings. Firstly, we show that terms from an unsupervised term discovery system can be used for contrastive self-supervision, resulting in improvements over previous unsupervised monolingual AWE models. Secondly, we consider how multilingual AWE models can be adapted to a specific zero-resource language using discovered terms. We find that…
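To make the idea of contrastive self-supervision on discovered word pairs more concrete, here is a minimal sketch of a generic temperature-scaled contrastive loss over embeddings. It is not taken from the paper: the function names, the cosine similarity, the temperature value, and the toy two-dimensional vectors are all assumptions for illustration, and the paper's exact loss may differ. The anchor and positive stand in for embeddings of two segments that a term discovery system grouped together; the negatives stand in for embeddings of other segments.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Generic softmax-style contrastive loss for one anchor (assumed form).

    The loss is low when the anchor is closer to its positive than to
    any of the negatives, and high otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy embeddings (hypothetical values, not real acoustic word embeddings).
anchor = [1.0, 0.0]
same_word = [0.9, 0.1]        # segment paired with the anchor
other_words = [[0.0, 1.0], [-1.0, 0.0]]

# Pair is close and negatives are far: low loss.
good = contrastive_loss(anchor, same_word, other_words)

# "Positive" is far and a negative is close: high loss.
bad = contrastive_loss(anchor, [0.0, 1.0], [same_word])
```

In this sketch `good` is much smaller than `bad`, which is the behaviour a contrastive objective rewards: segments believed to be the same word are pulled together and other segments are pushed apart.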

Taxonomy

Methods: Contrastive Learning