Cantonese Automatic Speech Recognition Using Transfer Learning from Mandarin

Bryan Li; Xinyue Wang; Homayoon Beigi

arXiv:1911.09271 · cs.CL · November 22, 2019 · 5 cites

TL;DR

This paper applies transfer learning from Mandarin to Cantonese automatic speech recognition, enabling faster training with less data and a slight accuracy improvement for a low-resource Cantonese ASR system.

Contribution

The study shows that transferring the weights of several layers from a Mandarin model to a newly initialized Cantonese model reduces training time and slightly improves recognition accuracy.

Findings

01

Transfer learning reduces training time for Cantonese ASR.

02

At every epoch, log-probability is smaller for transfer learning models than for a Cantonese-only model.

03

Transfer learning models show a slight improvement in character error rate (CER).
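
For readers unfamiliar with the metric, here is a minimal sketch of how CER is commonly computed, assuming the usual definition: character-level edit distance between hypothesis and reference, divided by the reference length. The function names and the example sentence are illustrative only, and the paper's exact scoring setup is not described on this page.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate; spaces are dropped (an assumption of this sketch)."""
    ref = list(reference.replace(" ", ""))
    hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)


# One substituted character out of five reference characters.
example_cer = cer("我哋去食飯", "我地去食飯")
```

Because written Chinese has no explicit word boundaries, character-level scoring avoids depending on a word segmenter, which is why CER rather than word error rate is typically reported for Cantonese and Mandarin.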

Abstract

We propose a system to develop a basic automatic speech recognizer (ASR) for Cantonese, a low-resource language, through transfer learning from Mandarin, a high-resource language. We take a time-delay neural network (TDNN) trained on Mandarin and transfer the weights of several layers to a newly initialized model for Cantonese. We experiment with the number of layers transferred, their learning rates, and pretraining i-vectors. The key finding is that this approach allows faster training with less data. We find that for every epoch, log-probability is smaller for transfer learning models compared to a Cantonese-only model. The transfer learning models show a slight improvement in CER.
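
The weight-transfer step described in the abstract can be illustrated with a toy, framework-free sketch. Everything here is an assumption made for illustration: the layer names, the flat lists standing in for weight matrices, the `transfer_layers` helper, and the 0.25 learning-rate scale are hypothetical and are not taken from the paper, which does not specify its toolkit or settings on this page.

```python
import copy
import random

# Hypothetical layer layout: a stack of TDNN layers plus an output layer.
LAYERS = ["tdnn1", "tdnn2", "tdnn3", "tdnn4", "output"]


def init_model(layer_names, size, seed):
    """Randomly initialize a toy model: layer name -> flat list of weights."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-0.1, 0.1) for _ in range(size)]
            for name in layer_names}


def transfer_layers(source, target, num_layers, transferred_lr_scale=0.25):
    """Copy the first `num_layers` layers from `source` into a copy of `target`.

    Returns the new model and a per-layer learning-rate scale: transferred
    layers get a reduced rate, freshly initialized layers keep the full rate.
    """
    new_model = copy.deepcopy(target)
    lr_scale = {}
    for idx, name in enumerate(target):
        if idx < num_layers and name in source:
            new_model[name] = list(source[name])   # weight transfer
            lr_scale[name] = transferred_lr_scale  # train transferred layers gently
        else:
            lr_scale[name] = 1.0                   # train new layers at full rate
    return new_model, lr_scale


mandarin = init_model(LAYERS, size=4, seed=0)        # stands in for the trained Mandarin model
cantonese_init = init_model(LAYERS, size=4, seed=1)  # newly initialized Cantonese model
cantonese, lr_scale = transfer_layers(mandarin, cantonese_init, num_layers=3)
```

In this sketch the number of transferred layers and the learning-rate scale are the two knobs corresponding to the experimental variables named in the abstract. The upper layers stay freshly initialized, a common choice when the source and target languages have different output units.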

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and Audio Processing