Noise robust speech emotion recognition with signal-to-noise ratio adapting speech enhancement
Yu-Wen Chen, Julia Hirschberg, Yu Tsao

TL;DR
This paper introduces NRSER, a noise-robust speech emotion recognition system that employs speech enhancement and SNR-level detection to improve accuracy in noisy environments and prevent false emotion predictions on noise-only signals.
Contribution
The study presents a novel NRSER system combining speech enhancement with SNR-level detection and waveform reconstitution for improved noise robustness in SER.
Findings
NRSER significantly improves emotion recognition accuracy in noisy conditions.
The SNR-level detection can be used for effective data selection.
NRSER prevents false emotion predictions on background noise.
Abstract
Speech emotion recognition (SER) often experiences reduced performance due to background noise. In addition, making a prediction on signals with only background noise could undermine user trust in the system. In this study, we propose a Noise Robust Speech Emotion Recognition system, NRSER. NRSER employs speech enhancement (SE) to effectively reduce the noise in input signals. Then, the signal-to-noise-ratio (SNR)-level detection structure and waveform reconstitution strategy are introduced to reduce the negative impact of SE on speech signals with no or little background noise. Our experimental results show that NRSER can effectively improve the noise robustness of the SER system, including preventing the system from making emotion recognition on signals consisting solely of background noise. Moreover, the proposed SNR-level detection structure can be used individually for tasks such…
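The two signal-level ideas named in the abstract, SNR and waveform reconstitution, can be illustrated with a minimal sketch. The SNR formula is the standard decibel definition. The blending rule is only an assumption made for illustration: the abstract does not say how NRSER reconstitutes the waveform, so the function `reconstitute`, the level names `high`/`mid`/`low`, and the weights are all hypothetical.

```python
import numpy as np


def snr_db(speech, noise):
    """Standard SNR in decibels: 10 * log10(P_speech / P_noise)."""
    p_speech = np.mean(np.square(speech))
    p_noise = np.mean(np.square(noise))
    return 10.0 * np.log10(p_speech / p_noise)


def reconstitute(noisy, enhanced, snr_level):
    """Hypothetical blend of the original and enhanced waveforms.

    Assumption (not taken from the paper): a cleaner input keeps more of
    the original signal, so enhancement artifacts are limited when there
    is little noise to remove.
    """
    # Illustrative weights on the enhanced waveform, one per SNR level.
    enhanced_weight = {"high": 0.0, "mid": 0.5, "low": 1.0}
    w = enhanced_weight[snr_level]
    return w * enhanced + (1.0 - w) * noisy
```

With these placeholder weights, a `high` SNR level passes the input through unchanged and a `low` level uses the enhanced waveform alone. A real system would need a policy for the noise-only case as well, which this sketch does not cover.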
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Emotion and Mood Recognition
