Small-footprint Deep Neural Networks with Highway Connections for Speech Recognition

Liang Lu; Steve Renals

arXiv:1512.04280·cs.CL·June 15, 2017·2 cites

TL;DR

This paper uses highway neural networks to train thinner, deeper DNNs for speech recognition. The resulting models keep accuracy high with significantly fewer parameters, which makes them suitable for resource-constrained devices.

Contribution

It demonstrates that highway connections enable the training of compact, deep DNNs that outperform their plain DNN counterparts on speech recognition tasks.

Findings

01

Highway DNNs consistently outperform their plain DNN counterparts on the AMI meeting speech transcription corpus.

02

Significant reduction in model parameters without accuracy loss.

03

Effective for resource-limited speech recognition applications.
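To make the "thinner and deeper" trade-off concrete, the sketch below counts weights and biases in a plain feed-forward DNN for two configurations. All layer sizes here are hypothetical and chosen only for illustration; they are not taken from the paper, and the count ignores any extra parameters that highway gates would add.

```python
def dense_param_count(input_dim, hidden_dim, num_hidden_layers, output_dim):
    """Count weights plus biases of a plain fully connected network."""
    dims = [input_dim] + [hidden_dim] * num_hidden_layers + [output_dim]
    # Each layer contributes a (d_in x d_out) weight matrix and d_out biases.
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims[:-1], dims[1:]))


# Hypothetical sizes, NOT from the paper: a wide, shallower model
# versus a thin, deeper one with the same input and output dimensions.
wide = dense_param_count(input_dim=440, hidden_dim=2048,
                         num_hidden_layers=6, output_dim=4000)
thin = dense_param_count(input_dim=440, hidden_dim=512,
                         num_hidden_layers=10, output_dim=4000)

print(f"wide: {wide:,} parameters")
print(f"thin: {thin:,} parameters")
print(f"thin / wide = {thin / wide:.2f}")
```

Because the hidden-to-hidden matrices grow with the square of the layer width, shrinking the width reduces the parameter count much faster than adding layers increases it.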

Abstract

For speech recognition, deep neural networks (DNNs) have significantly improved recognition accuracy on most benchmark datasets and application domains. However, compared to conventional Gaussian mixture models, DNN-based acoustic models usually have a much larger number of model parameters, which makes them challenging to deploy on resource-constrained platforms such as mobile devices. In this paper, we study the application of the recently proposed highway network to train small-footprint DNNs, which are thinner and deeper, and have a significantly smaller number of model parameters than conventional DNNs. We investigated this approach on the AMI meeting speech transcription corpus, which contains around 70 hours of audio data. The highway neural networks consistently outperformed their plain DNN counterparts, and the number of model parameters can be reduced…
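As a rough illustration of the highway connection the abstract refers to, here is a minimal NumPy sketch of one highway layer's forward pass. It assumes the general highway formulation with a separate transform gate and carry gate, and a sigmoid hidden activation; the function and parameter names are hypothetical, and the paper's exact configuration (for example, how gate parameters are tied across layers) is not given in this excerpt.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def highway_layer(x, W_h, b_h, W_t, b_t, W_c, b_c):
    """One highway layer: y = h * t + x * c (all products element-wise).

    x has shape (batch, dim); input and output widths must match so
    that the carried input x can be added to the transformed output.
    """
    h = sigmoid(x @ W_h + b_h)  # ordinary hidden-layer transformation
    t = sigmoid(x @ W_t + b_t)  # transform gate: how much of h to pass on
    c = sigmoid(x @ W_c + b_c)  # carry gate: how much of x to pass through
    return h * t + x * c


# Small usage example with random, illustrative values.
rng = np.random.default_rng(0)
dim = 4
x = rng.standard_normal((2, dim))
W_h, W_t, W_c = (0.1 * rng.standard_normal((dim, dim)) for _ in range(3))
b_h, b_t, b_c = np.zeros(dim), np.zeros(dim), np.zeros(dim)

y = highway_layer(x, W_h, b_h, W_t, b_t, W_c, b_c)
print(y.shape)
```

When the carry gate saturates near 1 and the transform gate near 0, the layer passes its input through almost unchanged, which is the property usually credited with making thin, deep stacks easier to train.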

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing

Methods: Sigmoid Activation · Highway Layer · Highway Network