Exploiting Cross-Lingual Knowledge in Unsupervised Acoustic Modeling for Low-Resource Languages
Siyuan Feng

TL;DR
This paper investigates unsupervised acoustic modeling for zero-resource speech recognition, emphasizing cross-lingual knowledge transfer to improve subword discovery and representation in low-resource languages.
Contribution
It introduces methods leveraging cross-lingual knowledge to enhance unsupervised subword discovery and feature learning in zero-resource speech recognition.
Findings
Cross-lingual transfer improves subword unit discovery.
Unsupervised features become more linguistically discriminative.
Robustness to non-linguistic factors is enhanced.
Abstract
(Short version of Abstract) This thesis describes an investigation of unsupervised acoustic modeling (UAM) for automatic speech recognition (ASR) in the zero-resource scenario, where only untranscribed speech data is assumed to be available. UAM is important not only for addressing the general problem of data scarcity in ASR technology development but also for many non-mainstream applications, for example, language protection, language acquisition and pathological speech assessment. The present study focuses on two research problems. The first problem concerns unsupervised discovery of basic (subword-level) speech units in a given language. Under the zero-resource condition, the speech units can be inferred only from the acoustic signals, without requiring or involving any linguistic direction and/or constraints. The second problem is referred to as unsupervised subword…
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing
