Kaldi+PDNN: Building DNN-based ASR Systems with Kaldi and PDNN

Yajie Miao

arXiv:1401.6984 · cs.LG · January 28, 2014 · 64 cites

TL;DR

This paper presents open-source recipes that combine the Kaldi ASR toolkit with PDNN, a lightweight Theano-based deep learning toolkit, to build several kinds of DNN-based speech recognition systems.

Contribution

It introduces practical recipes combining Kaldi and PDNN for building DNN, CNN, and bottleneck feature ASR systems, facilitating easier adaptation to new datasets.

Findings

01 Kaldi and PDNN recipes can build DNN hybrid, CNN, and bottleneck feature systems

02 The recipes are easy to adapt to new datasets

03 The recipes are based directly on the Kaldi Switchboard 110-hour setup

Abstract

The Kaldi toolkit is becoming popular for constructing automated speech recognition (ASR) systems. Meanwhile, in recent years, deep neural networks (DNNs) have shown state-of-the-art performance on various ASR tasks. This document describes our open-source recipes to implement fully-fledged DNN acoustic modeling using Kaldi and PDNN. PDNN is a lightweight deep learning toolkit developed under the Theano environment. Using these recipes, we can build up multiple systems including DNN hybrid systems, convolutional neural network (CNN) systems and bottleneck feature systems. These recipes are directly based on the Kaldi Switchboard 110-hour setup. However, adapting them to new datasets is easy to achieve.

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing