Exploiting Cross-Lingual Knowledge in Unsupervised Acoustic Modeling for Low-Resource Languages

Siyuan Feng

arXiv:2007.15074·eess.AS·July 31, 2020

Open Access

TL;DR

This paper investigates unsupervised acoustic modeling for zero-resource speech recognition, emphasizing cross-lingual knowledge transfer to improve subword discovery and representation in low-resource languages.

Contribution

It introduces methods leveraging cross-lingual knowledge to enhance unsupervised subword discovery and feature learning in zero-resource speech recognition.

Findings

01

Cross-lingual transfer improves subword unit discovery.

02

Unsupervised features become more linguistically discriminative.

03

Robustness to non-linguistic factors is enhanced.

Abstract

(Short version of Abstract) This thesis describes an investigation of unsupervised acoustic modeling (UAM) for automatic speech recognition (ASR) in the zero-resource scenario, where only untranscribed speech data is assumed to be available. UAM is not only important in addressing the general problem of data scarcity in ASR technology development but also essential to many non-mainstream applications, for example, language protection, language acquisition and pathological speech assessment. The present study focuses on two research problems. The first problem concerns unsupervised discovery of basic (subword-level) speech units in a given language. Under the zero-resource condition, the speech units can be inferred only from the acoustic signals, without requiring or involving any linguistic direction and/or constraints. The second problem is referred to as unsupervised subword…

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing