Sequence-based Multi-lingual Low Resource Speech Recognition

Siddharth Dalmia; Ramon Sanabria; Florian Metze; Alan W. Black

arXiv:1802.07420 · cs.CL · March 8, 2018

TL;DR

This paper demonstrates that end-to-end sequence models trained with CTC loss can effectively improve low-resource multilingual speech recognition and adapt to new languages with limited data.

Contribution

It shows the effectiveness of end-to-end multi-lingual training of sequence models for low-resource speech recognition and cross-lingual adaptation.

Findings

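01 Over 6% absolute reduction in word/phoneme error rate on Babel languages, compared with mono-lingual systems built in the same setting

02 Effective cross-lingual adaptation to an unseen language using just 25% of the target data

03 Training on multiple languages benefits very low-resource scenarios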
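The error rates quoted above are edit-distance metrics: the number of substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length. The same computation gives word error rate over word tokens and phoneme error rate over phoneme tokens. A minimal sketch follows; the function name `error_rate` and the toy sentences are illustrative assumptions, not taken from the paper or its scoring tools.

```python
def error_rate(reference, hypothesis):
    """Token error rate: Levenshtein distance divided by reference length.

    `reference` and `hypothesis` are sequences of tokens (words or phonemes).
    Hypothetical helper for illustration only.
    """
    if len(reference) == 0:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, 1):
        cur = [i]  # deleting all i reference tokens
        for j, hyp_tok in enumerate(hypothesis, 1):
            cur.append(min(
                prev[j] + 1,                         # deletion
                cur[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_tok != hyp_tok),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(reference)


# Toy example (made-up sentences): one substituted word out of three.
wer = error_rate("the cat sat".split(), "the cat sit".split())
```

An "absolute" reduction is the difference between two such rates in percentage points (baseline rate minus new rate), as opposed to a relative reduction, which divides that difference by the baseline rate.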

Abstract

Techniques for multi-lingual and cross-lingual speech recognition can help in low-resource scenarios to bootstrap systems and enable analysis of new languages and domains. End-to-end approaches, in particular sequence-based techniques, are attractive because of their simplicity and elegance. While it is possible to integrate traditional multi-lingual bottleneck feature extractors as front-ends, we show that end-to-end multi-lingual training of sequence models is effective on context-independent models trained using Connectionist Temporal Classification (CTC) loss. We show that our model improves performance on Babel languages by over 6% absolute in terms of word/phoneme error rate when compared to mono-lingual systems built in the same setting for these languages. We also show that the trained model can be adapted cross-lingually to an unseen language using just 25% of the target data.…
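For readers unfamiliar with the CTC loss named in the abstract, the sketch below computes the CTC negative log-likelihood of a label sequence with the standard forward (alpha) recursion. It is a plain-Python illustration of the general technique, not the authors' implementation; the function name `ctc_neg_log_likelihood`, the choice of index 0 for the blank symbol, and the toy inputs are all assumptions.

```python
import math

NEG_INF = float("-inf")


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def ctc_neg_log_likelihood(log_probs, target, blank=0):
    """CTC loss for one utterance (illustrative sketch).

    log_probs: T frames, each a list of per-label log-probabilities.
    target:    list of label indices, without blanks.
    """
    # Interleave blanks: blank, l1, blank, l2, ..., blank.
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S = len(ext)

    # Initialisation: a path may start in the first blank or the first label.
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new_alpha = [NEG_INF] * S
        for s in range(S):
            terms = [alpha[s]]                 # stay on the same symbol
            if s >= 1:
                terms.append(alpha[s - 1])     # advance by one symbol
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            new_alpha[s] = _logsumexp(terms) + log_probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end in the last label or the trailing blank.
    final = _logsumexp(alpha[-2:]) if S > 1 else alpha[-1]
    return -final


# Toy input: 2 frames, 2 symbols (blank and one label), uniform probabilities.
uniform = [[math.log(0.5), math.log(0.5)]] * 2
loss = ctc_neg_log_likelihood(uniform, [1])
```

In the toy case, three of the four possible two-frame alignments collapse to the single target label, so the total path probability is 0.75 and the loss is its negative logarithm.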
