Unsupervised Subword Modeling Using Autoregressive Pretraining and Cross-Lingual Phone-Aware Modeling

Siyuan Feng; Odette Scharenborg

arXiv:2007.13002·eess.AS·October 30, 2020

TL;DR

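This paper presents an unsupervised subword modeling approach that combines autoregressive predictive coding (APC) pretraining with a DNN bottleneck-feature model trained on cross-lingual phone labels. The system outperforms the state of the art on Libri-light and ZeroSpeech 2017, and APC pretraining reduces the amount of training data needed.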

Contribution

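The paper introduces a two-stage bottleneck feature (BNF) learning framework: an APC front-end followed by a DNN-BNF back-end trained on cross-lingual phone labels from a language-mismatched ASR system. It also examines how robust the approach is to different amounts of training data.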

Findings

01. APC is effective for front-end feature pretraining.

02. The full system outperforms the state of the art on Libri-light and ZeroSpeech 2017.

03. APC pretraining reduces the amount of training data needed.

Abstract

This study addresses unsupervised subword modeling, i.e., learning feature representations that can distinguish the subword units of a language. The proposed approach adopts a two-stage bottleneck feature (BNF) learning framework, consisting of autoregressive predictive coding (APC) as the front-end and a DNN-BNF model as the back-end. APC-pretrained features are used as input to the DNN-BNF model. A language-mismatched ASR system provides cross-lingual phone labels for DNN-BNF model training. Finally, BNFs are extracted as the subword-discriminative feature representation. A second aim of this work is to investigate how robust the approach's effectiveness is to different amounts of training data. Results on the Libri-light and ZeroSpeech 2017 databases show that APC is effective in front-end feature pretraining. Our whole system outperforms the state of the art on both…
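
The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the commonly described APC objective (an L1 loss between the model's output at frame t and the input frame a fixed number of steps ahead) and a plain feed-forward network with a linear bottleneck layer. The function names `apc_l1_loss` and `extract_bnf`, the layer sizes, the ReLU activations, and the random stand-in data are all hypothetical choices made for illustration.

```python
import numpy as np


def apc_l1_loss(features, predictions, shift):
    """Mean absolute error between predictions[t] and features[t + shift]."""
    num_frames = features.shape[0]
    if not 1 <= shift < num_frames:
        raise ValueError("shift must satisfy 1 <= shift < number of frames")
    targets = features[shift:]                  # frames the model should predict
    aligned = predictions[:num_frames - shift]  # predictions that have a target
    return float(np.mean(np.abs(aligned - targets)))


def extract_bnf(inputs, layers, bottleneck_index):
    """Run a feed-forward pass and return the bottleneck layer's output."""
    hidden = inputs
    for index, (weights, bias) in enumerate(layers):
        hidden = hidden @ weights + bias
        if index == bottleneck_index:
            return hidden                 # linear bottleneck output (assumption)
        hidden = np.maximum(hidden, 0.0)  # ReLU on earlier layers (assumption)
    raise IndexError("bottleneck_index is beyond the last layer")


rng = np.random.default_rng(0)
# Hypothetical stand-in for 50 frames of 13-dimensional front-end features.
frames = rng.standard_normal((50, 13))

# A perfect 3-step-ahead predictor yields zero APC loss.
perfect = np.roll(frames, -3, axis=0)
perfect_loss = apc_l1_loss(frames, perfect, shift=3)

# Randomly initialised 13 -> 32 -> 8 -> 40 network; layer 1 is the bottleneck.
# The 40 outputs stand in for cross-lingual phone classes (hypothetical size).
sizes = [13, 32, 8, 40]
layers = [
    (0.1 * rng.standard_normal((n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]
bnf = extract_bnf(frames, layers, bottleneck_index=1)
print(perfect_loss, bnf.shape)
```

In a real system the back-end would first be trained to classify the cross-lingual phone labels, and the bottleneck output would be read off only afterwards; the sketch skips training and shows only the data flow.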
