Improved acoustic word embeddings for zero-resource languages using multilingual transfer

Herman Kamper; Yevgen Matusevych; Sharon Goldwater

arXiv:2006.02295 · cs.CL · February 8, 2021

TL;DR

This paper uses multilingual transfer to improve acoustic word embeddings for zero-resource languages, enabling better speech search and discovery when no labelled data exists in the target language.

Contribution

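It compares three multilingual RNN models (a joint-vocabulary classifier, a Siamese RNN, and a correspondence autoencoder), each trained on labelled data from several well-resourced languages, and shows that all of them improve substantially over unsupervised methods in zero-resource settings.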

Findings

01. All three multilingual models outperform state-of-the-art unsupervised models, with relative improvements of more than 30% in average precision.

02. The CAE model encodes more phonetic and speaker information than the other models.

03. Adding training languages generally improves embedding quality, with diminishing returns.
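
Average precision for acoustic word embeddings is commonly measured on a same-different word discrimination task: every pair of test segments is ranked by embedding distance, and precision is averaged over the ranks where same-word pairs occur. The following is a minimal sketch of that computation, assuming cosine distance between embeddings. The function names and toy data are hypothetical, and this is not the authors' evaluation code.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def same_different_ap(embeddings, labels):
    """Average precision over all segment pairs, ranked by distance.

    A pair counts as a positive when both segments carry the same
    word label. Pairs are sorted from closest to farthest.
    """
    pairs = []
    for i, j in combinations(range(len(labels)), 2):
        dist = cosine_distance(embeddings[i], embeddings[j])
        pairs.append((dist, labels[i] == labels[j]))
    pairs.sort(key=lambda p: p[0])

    n_same = sum(1 for _, same in pairs if same)
    if n_same == 0:
        raise ValueError("need at least one same-word pair")

    hits = 0
    precision_sum = 0.0
    for rank, (_, same) in enumerate(pairs, start=1):
        if same:
            hits += 1
            precision_sum += hits / rank  # precision at this positive
    return precision_sum / n_same


# Toy 2-D "embeddings" (hypothetical values, for illustration only).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = ["a", "a", "b", "b"]
ap = same_different_ap(toy_embeddings, toy_labels)
```

With these toy values both same-word pairs are closer than every different-word pair, so the ranking is perfect. Embeddings that separate word types less cleanly push same-word pairs down the ranking and lower the score.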

Abstract

Acoustic word embeddings are fixed-dimensional representations of variable-length speech segments. Such embeddings can form the basis for speech search, indexing and discovery systems when conventional speech recognition is not possible. In zero-resource settings where unlabelled speech is the only available resource, we need a method that gives robust embeddings on an arbitrary language. Here we explore multilingual transfer: we train a single supervised embedding model on labelled data from multiple well-resourced languages and then apply it to unseen zero-resource languages. We consider three multilingual recurrent neural network (RNN) models: a classifier trained on the joint vocabularies of all training languages; a Siamese RNN trained to discriminate between same and different words from multiple languages; and a correspondence autoencoder (CAE) RNN trained to reconstruct word…
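
To make the "fixed-dimensional representation of a variable-length segment" idea concrete, the sketch below runs a tiny recurrent encoder over frame sequences of different lengths and takes the final hidden state as the embedding. It is only an illustration under stated assumptions: the weights are random and untrained, the 13-dimensional frames stand in for acoustic features, and `make_rnn`, `embed`, and the chosen sizes are hypothetical rather than the architecture used in the paper.

```python
import math
import random


def make_rnn(input_dim, hidden_dim, seed=0):
    """Create random (untrained) weights for a simple tanh RNN."""
    rng = random.Random(seed)
    scale = 0.1
    w_in = [[rng.uniform(-scale, scale) for _ in range(input_dim)]
            for _ in range(hidden_dim)]
    w_rec = [[rng.uniform(-scale, scale) for _ in range(hidden_dim)]
             for _ in range(hidden_dim)]
    return w_in, w_rec


def embed(frames, params):
    """Encode a sequence of frames; the final hidden state is the embedding."""
    w_in, w_rec = params
    hidden = [0.0] * len(w_rec)
    for frame in frames:
        # The comprehension reads the previous hidden state in full
        # before the new state is assigned.
        hidden = [
            math.tanh(
                sum(w * x for w, x in zip(w_in[k], frame))
                + sum(w * h for w, h in zip(w_rec[k], hidden))
            )
            for k in range(len(hidden))
        ]
    return hidden


def random_segment(n_frames, dim, seed):
    """Stand-in for a speech segment: n_frames vectors of size dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_frames)]


params = make_rnn(input_dim=13, hidden_dim=8)
short_word = random_segment(n_frames=40, dim=13, seed=1)
long_word = random_segment(n_frames=95, dim=13, seed=2)

short_embedding = embed(short_word, params)
long_embedding = embed(long_word, params)
```

Both segments map to vectors of the same size regardless of duration, which is what allows them to be compared with a simple distance. In a trained model, the weights would be learned with one of the supervised objectives the abstract lists.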

Code & Models

Repositories

kamperh/globalphone_awe (TensorFlow, official)
