DARTS: Dialectal Arabic Transcription System

Sameer Khurana; Ahmed Ali; James Glass

arXiv:1909.12163·cs.CL·September 27, 2019·5 cites

TL;DR

This paper introduces DARTS, a speech-to-text system for the Egyptian Arabic dialect. It uses transfer learning, together with semi-supervised learning on YouTube audio, to improve transcription accuracy in low-resource settings.

Contribution

The paper presents a speech transcription system for the Egyptian Arabic dialect that combines transfer learning from a high-resource broadcast domain with semi-supervised learning on unlabeled in-domain audio to improve performance in low-resource conditions.

Findings

01 Transfer learning yields good results in low-resource dialect transcription.

02 Semi-supervised learning with YouTube data further improves accuracy.

03 The combined system achieves the lowest word error rate on the MGB-3 dataset.
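
The findings above are stated in terms of word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch assuming the standard definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example strings are illustrative and not taken from the paper, which may apply its own text normalization or scoring conventions before computing WER.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings only: one substitution and one deletion
# against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.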

Abstract

We present DARTS, a speech-to-text transcription system for the low-resource Egyptian Arabic dialect. We analyze two approaches: transfer learning from the high-resource broadcast domain to the low-resource dialectal domain, and semi-supervised learning using in-domain unlabeled audio data collected from YouTube. Key features of our system are: a deep neural network acoustic model consisting of a front-end Convolutional Neural Network (CNN) followed by several layers of Time Delay Neural Network (TDNN) and Long Short-Term Memory (LSTM) recurrent neural network; sequence-discriminative training of the acoustic model; and n-gram and recurrent neural network language models for decoding and N-best list rescoring. We show that a simple transfer learning method can achieve good results. The results are further improved by using unlabeled data from YouTube in a semi-supervised setup.…
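
The abstract mentions N-best list rescoring with a recurrent neural network language model but does not spell out the scoring formula. The sketch below shows one common form of the idea, assuming each hypothesis carries a log-domain acoustic score and an n-gram LM score, and that the RNN LM score is linearly interpolated with the n-gram score. The names `Hypothesis`, `rescore_nbest`, `lm_weight`, and `rnn_interp`, the toy scores, and the interpolation scheme are all assumptions for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str
    acoustic_score: float  # log-domain acoustic score (higher is better)
    ngram_lm_score: float  # log-probability from the n-gram LM


def rescore_nbest(
    nbest: List[Hypothesis],
    rnn_lm_score: Callable[[str], float],
    lm_weight: float = 1.0,   # hypothetical LM scale
    rnn_interp: float = 0.5,  # hypothetical RNN/n-gram interpolation weight
) -> List[Hypothesis]:
    """Return the N-best list sorted by the combined score, best first."""

    def combined(h: Hypothesis) -> float:
        lm = (1.0 - rnn_interp) * h.ngram_lm_score + rnn_interp * rnn_lm_score(h.text)
        return h.acoustic_score + lm_weight * lm

    return sorted(nbest, key=combined, reverse=True)


# Toy stand-in for an RNN LM: a lookup table of made-up log-probabilities.
toy_rnn_scores = {"hypothesis one": -8.0, "hypothesis two": -3.0}

nbest = [
    Hypothesis("hypothesis one", acoustic_score=-10.0, ngram_lm_score=-5.0),
    Hypothesis("hypothesis two", acoustic_score=-10.5, ngram_lm_score=-6.0),
]

reranked = rescore_nbest(nbest, rnn_lm_score=toy_rnn_scores.__getitem__)
print(reranked[0].text)
```

In this toy list the first-pass scores favor the first hypothesis, while the interpolated RNN LM score moves the second one to the top, which is the effect rescoring is meant to have when the stronger language model disagrees with the first pass.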

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing

Methods: Transfer Learning · Semi-Supervised Learning · CNN-TDNN-LSTM Acoustic Model