TL;DR
This paper introduces Light GRU, a simplified GRU variant for speech recognition that removes the reset gate and uses ReLU activations, which speeds up training and improves recognition accuracy, including in noisy conditions.
Contribution
The paper proposes Light GRU, an architecture derived from the standard GRU by removing the reset gate and replacing the tanh activation with ReLU, which improves both efficiency and recognition performance on ASR tasks.
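The two changes described above can be illustrated with a minimal single-step sketch in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the function name `ligru_step`, the weight names `Wz`, `Uz`, `Wh`, `Uh`, and the helper functions are hypothetical, biases are left out, and batch normalization (listed among the paper's methods) is omitted for brevity. The sketch also assumes the common convention in which the update gate `z` weights the previous state.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def ligru_step(x, h_prev, Wz, Uz, Wh, Uh):
    """One recurrent step with an update gate only (hypothetical sketch)."""
    # Update gate: z_t = sigmoid(Wz x_t + Uz h_{t-1})
    z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h_prev))]
    # Candidate state: ReLU replaces tanh, and h_{t-1} is used directly
    # (a standard GRU would first scale it by a reset gate).
    cand = [max(0.0, a + b) for a, b in zip(matvec(Wh, x), matvec(Uh, h_prev))]
    # New state: gate-weighted mix of the previous state and the candidate.
    return [zi * hp + (1.0 - zi) * c for zi, hp, c in zip(z, h_prev, cand)]

def rand_matrix(rows, cols, rng):
    """Small random weights, for demonstration only."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_in, n_hid = 3, 4
Wz, Wh = rand_matrix(n_hid, n_in, rng), rand_matrix(n_hid, n_in, rng)
Uz, Uh = rand_matrix(n_hid, n_hid, rng), rand_matrix(n_hid, n_hid, rng)

h = [0.0] * n_hid
for x in ([0.1, -0.2, 0.3], [0.0, 0.5, -0.1]):  # two toy input frames
    h = ligru_step(x, h, Wz, Uz, Wh, Uh)
```

Compared with a standard GRU step, this sketch needs two pairs of weight matrices instead of three, because no parameters are spent on a reset gate.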
Findings
Reduces training time by over 30% compared to standard GRU.
Improves recognition accuracy across various tasks and noise conditions.
Effective in both traditional and end-to-end speech recognition models.
Abstract
A field that has directly benefited from the recent advances in deep learning is Automatic Speech Recognition (ASR). Despite the great achievements of the past decades, however, a natural and robust human-machine speech interaction still appears to be out of reach, especially in challenging environments characterized by significant noise and reverberation. To improve robustness, modern speech recognizers often employ acoustic models based on Recurrent Neural Networks (RNNs), that are naturally able to exploit large time contexts and long-term speech modulations. It is thus of great interest to continue the study of proper techniques for improving the effectiveness of RNNs in processing speech signals. In this paper, we revise one of the most popular RNN models, namely Gated Recurrent Units (GRUs), and propose a simplified architecture that turned out to be very effective for ASR. The…
Taxonomy
Methods: Batch Normalization · Gated Recurrent Unit
