AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline
Hui Bu, Jiayu Du, Xingyu Na, Bengu Wu, Hao Zheng

TL;DR
AISHELL-1 is a large open-source Mandarin speech corpus with detailed recording procedures, resources, and a baseline Kaldi recipe, facilitating Mandarin speech recognition research and system development.
Contribution
This paper introduces AISHELL-1, described by the authors as the largest open-source Mandarin speech corpus suitable for speech recognition research at the time of release. It is published together with transcriptions, a lexicon, and a baseline Kaldi recipe.
Findings
Experimental results suggest that the quality of the audio recordings and transcriptions is promising
Suitable for Mandarin speech recognition research
Provides a Kaldi recipe for baseline experiments
Abstract
An open-source Mandarin speech corpus called AISHELL-1 is released. It is by far the largest corpus suitable for conducting speech recognition research and building speech recognition systems for Mandarin. The recording procedure, including audio capturing devices and environments, is presented in detail. The preparation of the related resources, including transcriptions and lexicon, is described. The corpus is released with a Kaldi recipe. Experimental results imply that the quality of the audio recordings and transcriptions is promising.
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing
