Small-footprint Deep Neural Networks with Highway Connections for Speech Recognition
Liang Lu, Steve Renals

TL;DR
This paper explores the use of highway neural networks to build thinner, deeper DNNs for speech recognition, achieving high accuracy with far fewer parameters and making the models suitable for resource-constrained devices.
Contribution
It demonstrates that highway connections enable the training of compact, deep DNNs that outperform their plain DNN counterparts on speech recognition tasks.
Findings
Highway DNNs consistently outperform plain DNNs on the AMI meeting transcription corpus.
Significant reduction in model parameters without accuracy loss.
Effective for resource-limited speech recognition applications.
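To make the parameter-reduction finding concrete, the sketch below counts the weights and biases of a plain feed-forward DNN and compares a wide, shallow configuration with a thin, deep one. The function name `dnn_param_count` and every layer size are hypothetical illustrations, not figures from the paper, and the count ignores any extra gate parameters a highway network would add.

```python
def dnn_param_count(input_dim, hidden_width, num_hidden, output_dim):
    """Count weights and biases in a plain fully connected DNN.

    Assumes every hidden layer has the same width (an illustrative
    simplification, not a detail taken from the paper).
    """
    first = input_dim * hidden_width + hidden_width
    hidden = (num_hidden - 1) * (hidden_width * hidden_width + hidden_width)
    output = hidden_width * output_dim + output_dim
    return first + hidden + output


# Hypothetical sizes chosen only to show the trend.
wide_shallow = dnn_param_count(input_dim=360, hidden_width=2048,
                               num_hidden=6, output_dim=4000)
thin_deep = dnn_param_count(input_dim=360, hidden_width=512,
                            num_hidden=10, output_dim=4000)

print(f"wide/shallow: {wide_shallow:,} parameters")
print(f"thin/deep:    {thin_deep:,} parameters")
```

Because each hidden-to-hidden matrix grows with the square of the layer width, narrowing the layers cuts the parameter count much faster than adding layers increases it.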
Abstract
For speech recognition, deep neural networks (DNNs) have significantly improved the recognition accuracy on most benchmark datasets and application domains. However, compared to conventional Gaussian mixture models, DNN-based acoustic models usually have a much larger number of model parameters, making them challenging to deploy on resource-constrained platforms, e.g., mobile devices. In this paper, we study the application of the recently proposed highway network to train small-footprint DNNs, which are thinner and deeper, and have a significantly smaller number of model parameters compared to conventional DNNs. We investigated this approach on the AMI meeting speech transcription corpus, which has around 70 hours of audio data. The highway neural networks consistently outperformed their plain DNN counterparts, and the number of model parameters can be reduced…
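The abstract does not spell out the layer equations, so the following is only a minimal sketch of a generic highway layer in the coupled-gate form y = T(x) * H(x) + (1 - T(x)) * x known from the highway network literature. The paper's exact gating scheme may differ. The names `highway_layer`, `init_params`, and `gate_bias` are hypothetical, and the sigmoid hidden activation is an assumption.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    """Return W x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]


def highway_layer(x, W_h, b_h, W_t, b_t):
    """One highway layer (assumed coupled-gate form).

    h is the usual hidden transform, t is the transform gate in (0, 1).
    The output mixes h with the unchanged input x, so input and output
    must have the same width.
    """
    h = [sigmoid(z) for z in affine(W_h, b_h, x)]
    t = [sigmoid(z) for z in affine(W_t, b_t, x)]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


def init_params(width, gate_bias, seed=0):
    """Small random square weights; the gate bias sets the initial gate state."""
    rng = random.Random(seed)

    def small_matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                for _ in range(width)]

    return small_matrix(), [0.0] * width, small_matrix(), [gate_bias] * width


x = [0.2, -0.4, 0.9]
carried = highway_layer(x, *init_params(3, gate_bias=-50.0))
transformed = highway_layer(x, *init_params(3, gate_bias=50.0))
```

With a strongly negative gate bias the layer passes its input through almost unchanged, and with a strongly positive one it behaves like a plain sigmoid layer. The ability to carry the input forward is the property usually credited with making deep, thin stacks easier to train.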
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing
Methods: Sigmoid Activation · Highway Layer · Highway Network
