Transferring Knowledge from a RNN to a DNN
William Chan, Nan Rosemary Ke, Ian Lane

TL;DR
This paper presents a method to transfer knowledge from a high-performing RNN to a smaller DNN for speech recognition, significantly improving the small DNN's accuracy on embedded systems.
Contribution
It introduces a knowledge transfer technique using soft alignments from RNNs to enhance small DNN performance in ASR tasks.
Findings
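The small DNN trained on soft RNN alignments achieved a WER of 3.93 on WSJ eval92.
The baseline small DNN reached a WER of 4.54, so this is a relative improvement of more than 13%.
The method targets deployment on embedded systems with limited computational capacity.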
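The relative improvement quoted in the findings can be checked from the two reported WER figures. The short sketch below does that arithmetic; the function name `relative_wer_reduction` is illustrative and not taken from the paper.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Fraction of the baseline WER removed by the new model."""
    return (baseline_wer - new_wer) / baseline_wer


# WER figures reported for WSJ eval92: baseline 4.54, RNN-supervised small DNN 3.93.
reduction = relative_wer_reduction(4.54, 3.93)
print(f"{reduction:.1%}")  # prints 13.4%
```

An absolute drop of 0.61 WER points on a 4.54 baseline is about 13.4%, which matches the "more than 13%" figure.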
Abstract
Deep Neural Network (DNN) acoustic models have yielded many state-of-the-art results in Automatic Speech Recognition (ASR) tasks. More recently, Recurrent Neural Network (RNN) models have been shown to outperform their DNN counterparts. However, state-of-the-art DNN and RNN models tend to be impractical to deploy on embedded systems with limited computational capacity. Traditionally, the approach for embedded platforms is either to train a small DNN directly or to train a small DNN that learns the output distribution of a large DNN. In this paper, we utilize a state-of-the-art RNN to transfer knowledge to a small DNN. We use the RNN model to generate soft alignments and minimize the Kullback-Leibler divergence against the small DNN. The small DNN trained on the soft RNN alignments achieved a 3.93 WER on the Wall Street Journal (WSJ) eval92 task compared to a baseline 4.54 WER or more than 13%…
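The training objective described in the abstract can be sketched as a per-frame KL divergence between the RNN's soft alignments and the small DNN's output distribution. The code below is a minimal NumPy illustration, not the authors' implementation: the function names, the array shapes (frames by acoustic states), and the direction KL(teacher || student) are all assumptions made for the example.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def kl_to_soft_targets(teacher_probs, student_logits, eps=1e-12):
    """Mean per-frame KL(teacher || student).

    teacher_probs:  (frames, states) soft alignments, assumed to come from the RNN.
    student_logits: (frames, states) pre-softmax outputs of the small DNN.
    """
    student_log_probs = log_softmax(student_logits)
    # Clip before the log so exact zeros in the teacher distribution are safe.
    teacher_log_probs = np.log(np.clip(teacher_probs, eps, 1.0))
    per_frame = (teacher_probs * (teacher_log_probs - student_log_probs)).sum(axis=-1)
    return per_frame.mean()


def kl_gradient(teacher_probs, student_logits):
    """Gradient of the mean KL with respect to the student logits."""
    student_probs = np.exp(log_softmax(student_logits))
    return (student_probs - teacher_probs) / teacher_probs.shape[0]
```

Because the teacher distribution is fixed during student training, minimizing this KL is equivalent to minimizing the cross-entropy between the soft alignments and the student outputs, which is why the gradient reduces to the difference between the two distributions.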
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing
