KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset

Saida Mussakhojayeva; Aigerim Janaliyeva; Almas Mirzakhmetov; Yerbolat Khassanov; Huseyin Atakan Varol

arXiv:2104.08459·eess.AS·September 9, 2021

TL;DR

This paper presents KazakhTTS, an open-source dataset of about 93 hours of transcribed speech for Kazakh text-to-speech synthesis, a low-resource language. The best baseline end-to-end models trained on it achieve MOS above 4 for both speakers.

Contribution

The paper introduces the first large-scale, publicly available Kazakh TTS dataset, together with baseline end-to-end models and a subjective MOS evaluation, to support Kazakh speech synthesis work in both academia and industry.

Findings

01
The best baseline TTS models achieve MOS above 4 for both speakers.

02
The dataset contains about 93 hours of transcribed speech from two professional speakers (female and male).

03
The dataset and models are freely available for use.

Abstract

This paper introduces a high-quality open-source speech synthesis dataset for Kazakh, a low-resource language spoken by over 13 million people worldwide. The dataset consists of about 93 hours of transcribed audio recordings spoken by two professional speakers (female and male). It is the first publicly available large-scale dataset developed to promote Kazakh text-to-speech (TTS) applications in both academia and industry. In this paper, we share our experience by describing the dataset development procedures and the challenges we faced, and we discuss important future directions. To demonstrate the reliability of our dataset, we built baseline end-to-end TTS models and evaluated them using the subjective mean opinion score (MOS) measure. Evaluation results show that the best TTS models trained on our dataset achieve MOS above 4 for both speakers, which makes them applicable for practical use.…
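The MOS measure mentioned in the abstract is the arithmetic mean of listener ratings, conventionally given on a 1 to 5 scale. The sketch below illustrates that calculation only; it is not the authors' evaluation script. The function name `mean_opinion_score`, the normal-approximation 95% interval, and the sample ratings are all assumptions made for illustration and do not come from the paper.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of 1-5 listener ratings.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate an interval")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in the range 1 to 5")
    # MOS is the plain arithmetic mean of the ratings.
    mos = statistics.fmean(ratings)
    # Assumption: normal-approximation interval using the sample standard deviation.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Made-up ratings for one synthesized voice (not data from the paper).
ratings = [5, 4, 4, 5, 3, 4, 5, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} ± {ci:.2f}")
```

In practice each listener rates many utterances, so a real evaluation would usually aggregate per system and per speaker before comparing scores.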

Code & Models

Repositories

IS2AI/Kazakh_TTS (PyTorch, official)
