Gated Recurrent Unit (GRU) for Emotion Classification from Noisy Speech
Rajib Rana

TL;DR
This paper investigates Gated Recurrent Units (GRU) for emotion classification from noisy speech. It reports performance comparable to Long Short-Term Memory (LSTM) with an 18.16% smaller run-time, which makes GRU a candidate for mobile applications.
Contribution
It is the first study to explore GRU for emotion recognition from noisy speech, highlighting its efficiency and potential for real-time mobile deployment.
Findings
GRU incurs an 18.16% smaller run-time than LSTM.
GRU performs comparably to LSTM in noisy conditions.
Results support GRU's suitability for embedded emotion recognition systems.
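One commonly cited reason for a run-time gap of this kind is structural. In the textbook formulation, a GRU layer has three weight blocks (update gate, reset gate, candidate state), while an LSTM layer has four (input, forget, and output gates plus the candidate cell state). The sketch below counts trainable parameters under that formulation. The function names and the layer sizes (13 input features, 64 hidden units) are illustrative assumptions, not values from the paper. A parameter count is only a rough proxy for run-time, and this sketch does not reproduce the 18.16% figure.

```python
def gated_rnn_params(input_size: int, hidden_size: int, n_blocks: int) -> int:
    """Count trainable parameters in one recurrent layer.

    Each block (a gate or a candidate state) is assumed to have input
    weights, recurrent weights, and a single bias vector.
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return n_blocks * per_block


def gru_params(input_size: int, hidden_size: int) -> int:
    # Update gate, reset gate, candidate state: 3 blocks.
    return gated_rnn_params(input_size, hidden_size, 3)


def lstm_params(input_size: int, hidden_size: int) -> int:
    # Input gate, forget gate, output gate, candidate cell state: 4 blocks.
    return gated_rnn_params(input_size, hidden_size, 4)


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    input_size, hidden_size = 13, 64
    gru = gru_params(input_size, hidden_size)
    lstm = lstm_params(input_size, hidden_size)
    print(f"GRU parameters:  {gru}")
    print(f"LSTM parameters: {lstm}")
    print(f"GRU / LSTM ratio: {gru / lstm:.2f}")
```

Under these assumptions the GRU layer always has three quarters of the LSTM layer's parameters, regardless of the sizes chosen. Some libraries use two bias vectors per block, which changes the absolute counts but not the 3:4 ratio.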
Abstract
Despite the enormous interest in emotion classification from speech, the impact of noise on emotion classification is not well understood. This matters because, with the tremendous advancement of smartphone technology, the smartphone can be a powerful medium for speech emotion recognition in natural environments outside the laboratory, where speech is likely to include background noise. We capitalize on the recent breakthroughs in Recurrent Neural Networks (RNNs) and investigate their performance for emotion classification from noisy speech. We focus in particular on the recently proposed Gated Recurrent Unit (GRU), which is yet to be explored for emotion recognition from speech. Experiments conducted with speech compounded with eight different types of noise reveal that GRU incurs an 18.16% smaller run-time while performing quite comparably to the Long Short-Term Memory…
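The abstract does not say how the speech was compounded with noise. A common approach, assumed here purely for illustration, is to scale a noise recording so that the mixture reaches a chosen signal-to-noise ratio (SNR) and then add it to the clean speech. The function name `mix_at_snr`, the synthetic signals, and the 10 dB target below are all hypothetical, not details from the paper.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the speech-to-noise power ratio is snr_db.

    Assumption: additive mixing at a fixed SNR; the paper's actual
    mixing procedure is not described in the abstract.
    """
    if not speech:
        raise ValueError("speech must not be empty")
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]

    # Average power of each signal.
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")

    # Choose scale so that p_speech / (scale**2 * p_noise) == 10**(snr_db / 10).
    scale = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins: a sine tone for speech, Gaussian samples for noise.
    speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

    noisy = mix_at_snr(speech, noise, snr_db=10.0)

    # Measure the SNR actually achieved by the mixture.
    residual = [y - s for y, s in zip(noisy, speech)]
    p_s = sum(s * s for s in speech)
    p_r = sum(r * r for r in residual)
    print(f"measured SNR: {10 * math.log10(p_s / p_r):.2f} dB")
```

Fixing the SNR rather than the raw noise amplitude keeps the noise level comparable across recordings with different loudness, which is why it is a common choice in noisy-speech experiments.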
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
