Speech Recognition Challenge in the Wild: Arabic MGB-3

Ahmed Ali; Stephan Vogel; Steve Renals

arXiv:1709.07276·cs.CL·September 22, 2017

TL;DR

The Arabic MGB-3 Challenge targets speech recognition for dialectal Arabic across diverse genres, introduces a dialect identification task, and reports the performance of systems from thirteen participating teams.

Contribution

This paper introduces the Arabic MGB-3 Challenge, which focuses on dialectal Arabic speech recognition and dialect identification across multiple genres, and presents detailed evaluation results.

Findings

01

Thirteen teams participated, submitting ten systems.

02

Significant progress was achieved in dialectal Arabic speech recognition.

03

Effective dialect identification methods were demonstrated.

Abstract

This paper describes the Arabic MGB-3 Challenge: Arabic Speech Recognition in the Wild. Unlike last year's Arabic MGB-2 Challenge, for which the recognition task was based on more than 1,200 hours of broadcast TV news recordings from Aljazeera Arabic TV programs, MGB-3 emphasises dialectal Arabic using a multi-genre collection of Egyptian YouTube videos. Seven genres were used for the data collection: comedy, cooking, family/kids, fashion, drama, sports, and science (TEDx). A total of 16 hours of videos, split evenly across the different genres, were divided into adaptation, development and evaluation data sets. The Arabic MGB-Challenge comprised two tasks: A) Speech transcription, evaluated on the MGB-3 test set, along with the 10-hour MGB-2 test set to report progress on the MGB-2 evaluation; B) Arabic dialect identification, introduced this year in order to distinguish between four…

Code & Models

Repositories

qcri/dialectID · none · Official
