AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR

Tobi Olatunji; Tejumade Afonja; Aditya Yadavalli; Chris Chinenye Emezue; Sahib Singh; Bonaventure F.P. Dossou; Joanne Osuchukwu; Salomey Osei; Atnafu Lambebo Tonja; Naome Etori; Clinton Mbataku

arXiv:2310.00274·cs.CL·October 3, 2023·1 cites

Open Access · 2 Models · 1 Dataset

TL;DR

This paper introduces AfriSpeech-200, a 200-hour Pan-African accented English speech dataset covering diverse accents. It aims to advance clinical and general-domain automatic speech recognition (ASR) for African-accented speech and to address racial bias in speech technology.

Contribution

It provides the first large-scale, publicly available African-accented speech dataset for clinical and general-domain ASR, along with benchmark models and evaluation resources.

Findings

01 Achieved state-of-the-art performance on AfriSpeech benchmark

02 Demonstrated significant performance gaps for African accents in existing ASR systems

03 Provided publicly available pre-trained models for African accented speech recognition
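
A performance gap like the one in finding 02 is usually quantified with word error rate (WER). This page does not name the metric, so using WER here is an assumption. The sketch below is a minimal, self-contained Python illustration, not the authors' evaluation code: the function names `edit_distance` and `wer` are hypothetical, and the transcript pair is invented for the example.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


# Invented example: one substituted word out of four reference words.
print(wer("patient denies chest pain", "patient denies chess pain"))  # 0.25
```

Averaging such a score per accent group and comparing the averages is one way to expose a gap between accents. Real evaluations typically also normalize punctuation and numerals before scoring, which this sketch omits.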

Abstract

Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors could see 30+ patients per day -- a heavy patient burden compared with developed countries -- but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However, clinical ASR is mature, even ubiquitous, in developed nations, and clinician-reported performance of commercial clinical ASR systems is generally satisfactory. Furthermore, the recent performance of general domain ASR is approaching human accuracy. However, several gaps exist. Several publications have highlighted racial bias with speech-to-text algorithms and performance on minority accents lags significantly. To our knowledge, there is no publicly available research or benchmark on accented African clinical ASR, and speech data is non-existent for the majority of African accents. We release…

Code & Models

Models

Datasets

intronhealth/afrispeech-200
dataset· 992 dl

Taxonomy

Topics: Interpreting and Communication in Healthcare