On Training Recurrent Networks with Truncated Backpropagation Through Time in Speech Recognition
Hao Tang, James Glass

TL;DR
This paper investigates how recurrent neural networks learn long-term dependencies in speech recognition, analyzing the effects of decoding strategies and training heuristics on their memory capabilities.
Contribution
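It relates two decoding approaches, online and batch decoding, to the classes of functions they correspond to, and connects batch decoding to truncated backpropagation through time, a popular training approach. Restricting the past history available at decoding time then makes it possible to analyze how well the networks remember long-term dependencies.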
Findings
The decoding approach (online or batch) determines how much past history the network can use for prediction.
Design choices affect the network's ability to capture long-term dependencies.
Connections between Markov processes and vanishing gradients are established.
Abstract
Recurrent neural networks have been the dominant models for many speech and language processing tasks. However, we understand little about the behavior and the class of functions recurrent networks can realize. Moreover, the heuristics used during training complicate the analyses. In this paper, we study recurrent networks' ability to learn long-term dependency in the context of speech recognition. We consider two decoding approaches, online and batch decoding, and show the classes of functions to which the decoding approaches correspond. We then draw a connection between batch decoding and a popular training approach for recurrent networks, truncated backpropagation through time. Changing the decoding approach restricts the amount of past history recurrent networks can use for prediction, allowing us to analyze their ability to remember. Empirically, we utilize long-term dependency in…
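The contrast between the two decoding approaches can be illustrated with a minimal sketch. It assumes a toy scalar RNN with fixed, hand-picked weights, and it reads "batch decoding" as restarting from a zero hidden state and consuming only a fixed window of recent frames. That reading, the function names (`rnn_step`, `online_decode`, `batch_decode`), and the `context` parameter are assumptions made for illustration, not the authors' exact definitions.

```python
import math


def rnn_step(h, x, w_h=0.5, w_x=1.0):
    """One step of a toy scalar RNN: h_t = tanh(w_h * h_{t-1} + w_x * x_t).

    The weights are arbitrary illustrative values, not trained parameters.
    """
    return math.tanh(w_h * h + w_x * x)


def online_decode(xs):
    """Carry the hidden state across the whole sequence (unbounded history)."""
    h, outputs = 0.0, []
    for x in xs:
        h = rnn_step(h, x)
        outputs.append(h)
    return outputs


def batch_decode(xs, context):
    """For each frame, restart from a zero state and read only the
    most recent `context` frames (hypothetical fixed-window variant)."""
    outputs = []
    for t in range(len(xs)):
        h = 0.0
        for x in xs[max(0, t - context + 1): t + 1]:
            h = rnn_step(h, x)
        outputs.append(h)
    return outputs


if __name__ == "__main__":
    frames = [1.0, 0.2, -0.3, 0.5, 0.1, -0.4]
    print(online_decode(frames))
    print(batch_decode(frames, context=3))
```

In this sketch, setting `context` to the full sequence length makes the two decoders agree, while a shorter `context` means a change in an early frame cannot reach later predictions. Limiting the window in this way is one means of probing how much history a network actually relies on.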
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
