1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis

Sewade Ogun; Abraham T. Owodunni; Tobi Olatunji; Eniola Alese; Babatunde Oladimeji; Tejumade Afonja; Kayode Olaleye; Naome A. Etori; Tosin Adewumi

arXiv:2406.11727 · eess.AS · June 28, 2024

Open Access · 1 model

TL;DR

This paper introduces Afro-TTS, a speech synthesis system that generates African-accented English speech in 86 accents with 1000 personas, improving the representation of African voices in speech technology.

Contribution

The paper presents Afro-TTS, the first multi-accent African English speech synthesis system with 1000 personas across 86 accents, improving diversity and naturalness in speech synthesis.

Findings

01 Successfully synthesized 86 African accents

02 Generated 1000 diverse personas for speech synthesis

03 Maintained naturalness and accentedness through interpolation

Abstract

Recent advances in speech synthesis have enabled many useful applications like audio directions in Google Maps, screen readers, and automated content generation on platforms like TikTok. However, these systems are mostly dominated by voices sourced from data-rich geographies with personas representative of their source data. Although 3000 of the world's languages are domiciled in Africa, African voices and personas are under-represented in these systems. As speech synthesis becomes increasingly democratized, it is desirable to increase the representation of African English accents. We present Afro-TTS, the first pan-African accented English speech synthesis system able to generate speech in 86 African accents, with 1000 personas representing the rich phonological diversity across the continent for downstream application in Education, Public Health, and Automated Content Creation.…

Code & Models

Models

🤗 intronhealth/afro-tts (model · 29 downloads · 13 likes)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Topic Modeling