Golos: Russian Dataset for Speech Research

Nikolay Karpov; Alexander Denisenko; Fedor Minkin

arXiv:2106.10161·eess.AS·June 21, 2021·Interspeech

TL;DR

This paper presents Golos, a large, freely available Russian speech dataset with about 1240 hours of manually annotated audio, along with an acoustic model trained with CTC loss and improved through transfer learning. Evaluated with beam-search decoding and a 3-gram language model, the reported word error rates are about 3.3% and 11.5%.

Contribution

Introduction of Golos, a comprehensive Russian speech dataset, and development of an acoustic model with transfer learning for improved speech recognition performance.

Findings

01

Golos dataset contains approximately 1240 hours of annotated speech.

02

With beam-search decoding and a 3-gram language model, the acoustic model reached word error rates of about 3.3% and 11.5%.

03

Transfer learning improved the model's accuracy on the dataset.

Abstract

This paper introduces a novel Russian speech dataset called Golos, a large corpus suitable for speech research. The dataset mainly consists of recorded audio files manually annotated on a crowd-sourcing platform. The total duration of the audio is about 1240 hours. We have made the corpus freely available to download, along with the acoustic model with CTC loss prepared on this corpus. Additionally, transfer learning was applied to improve the performance of the acoustic model. In order to evaluate the quality of the dataset with the beam-search algorithm, we have built a 3-gram language model on the open Common Crawl dataset. The total word error rate (WER) metrics turned out to be about 3.3% and 11.5%.
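For readers unfamiliar with the metric quoted in the abstract, word error rate is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a minimal illustration, not the authors' evaluation code. It assumes plain whitespace tokenization and no text normalization, and the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    # Assumption: words are separated by whitespace; no case or punctuation normalization.
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed so far
    # and the first j hypothesis words. With no reference words, it is j insertions.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        # Matching i reference words against an empty hypothesis costs i deletions.
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one substituted word out of four reference words.
    print(word_error_rate("включи свет на кухне", "включи свет в кухне"))
```

A corpus-level figure such as those reported above is normally obtained by summing the edit counts over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance rates.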

Taxonomy

Methods: Connectionist Temporal Classification Loss