Improving Low Resource Code-switched ASR using Augmented Code-switched TTS

Yash Sharma; Basil Abraham; Karan Taneja; Preethi Jyothi

arXiv:2010.05549 · cs.CL · October 13, 2020

TL;DR

This paper improves low-resource code-switched ASR through data augmentation with code-switched TTS synthesis, applying Mixup and a new loss function to reduce word error rate (WER) and improve code-switching detection.

Contribution

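It proposes two targeted techniques for leveraging TTS data in low-resource code-switched ASR: Mixup, an existing interpolation-based augmentation technique applied here to TTS and real speech samples, and a new loss function used in conjunction with TTS samples.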

Findings

1. Up to 5% absolute WER reduction achieved.
2. Significant improvement in code-switching detection.
3. Effective use of TTS for data augmentation in low-resource settings.

Abstract

Building Automatic Speech Recognition (ASR) systems for code-switched speech has recently gained renewed attention due to the widespread use of speech technologies in multilingual communities worldwide. End-to-end ASR systems are a natural modeling choice due to their ease of use and superior performance in monolingual settings. However, it is well known that end-to-end systems require large amounts of labeled speech. In this work, we investigate improving code-switched ASR in low resource settings via data augmentation using code-switched text-to-speech (TTS) synthesis. We propose two targeted techniques to effectively leverage TTS speech samples: 1) Mixup, an existing technique to create new training samples via linear interpolation of existing samples, applied to TTS and real speech samples, and 2) a new loss function, used in conjunction with TTS samples, to encourage code-switched…
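The Mixup step mentioned in the abstract (linear interpolation of a real sample and a TTS sample) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mixup_features` and `mixup_loss`, the Beta parameter `alpha=0.4`, the cropping of both utterances to the shorter length, and the interpolation of per-utterance losses with the same weight are all assumptions made for illustration.

```python
import numpy as np


def mixup_features(real_feats, tts_feats, alpha=0.4, rng=None):
    """Interpolate two feature matrices of shape (frames, dims).

    Hypothetical helper: the weight lam is drawn from Beta(alpha, alpha),
    as in the original Mixup formulation; alpha=0.4 is an assumed value.
    """
    rng = rng or np.random.default_rng()
    # Assumption: crop both utterances to the shorter one so shapes match.
    n_frames = min(real_feats.shape[0], tts_feats.shape[0])
    lam = float(rng.beta(alpha, alpha))
    mixed = lam * real_feats[:n_frames] + (1.0 - lam) * tts_feats[:n_frames]
    return mixed, lam


def mixup_loss(loss_real, loss_tts, lam):
    """Combine the losses against each transcript with the same weight.

    Assumption: for sequence targets the labels cannot be averaged directly,
    so the two per-utterance losses are interpolated instead.
    """
    return lam * loss_real + (1.0 - lam) * loss_tts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(120, 80))  # stand-in for real-speech features
    tts = rng.normal(size=(150, 80))   # stand-in for TTS features
    mixed, lam = mixup_features(real, tts, rng=rng)
    print(mixed.shape, round(lam, 3))
```

In a training loop under these assumptions, the model would be run on `mixed`, scored once against each of the two transcripts, and the two losses passed to `mixup_loss` with the returned `lam`.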

Taxonomy

Methods: Mixup