THCHS-30 : A Free Chinese Speech Corpus

Dong Wang; Xuewei Zhang

arXiv:1512.01882·cs.CL·December 11, 2015·191 cites

TL;DR

This paper introduces THCHS-30, a free Chinese speech corpus designed to support speech recognition research, especially for newcomers, and reports baseline performance including noisy conditions.

Contribution

It provides a publicly available Chinese speech dataset, facilitating accessible research and development in Chinese speech recognition.

Findings

01 Baseline system performance established

02 System tested under noisy conditions

03 Dataset supports full Chinese speech recognition development

Abstract

Speech data is crucially important for speech recognition research. Quite a few speech databases can be purchased at prices that are reasonable for most research institutes. However, for young people who are just starting research or have only recently become interested in this direction, the cost of data is still an annoying barrier. We support the 'free data' movement in speech recognition: research institutes (particularly those supported by public funds) publish their data freely so that new researchers can obtain sufficient data to kick off their careers. In this paper, we follow this trend and release a free Chinese speech database, THCHS-30, that can be used to build a full-fledged Chinese speech recognition system. We report the baseline system established with this database, including its performance under highly noisy conditions.

Code & Models

Repositories

foamliu/Listen-Attend-and-Spell
pytorch

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing