A breakthrough in Speech emotion recognition using Deep Retinal Convolution Neural Networks
Yafeng Niu, Dongsheng Zou, Yadong Niu, Zhongshi He, Hua Tan

TL;DR
This paper introduces a novel data augmentation method inspired by retinal imaging principles and a Deep Retinal Convolution Neural Network (DRCNN) that significantly improves speech emotion recognition accuracy, surpassing previous methods.
Contribution
The paper presents a new data augmentation technique based on retinal imaging principles and a specialized deep learning model, DRCNN, for enhanced speech emotion recognition.
Findings
Achieved over 99% average accuracy in SER
Outperformed previous methods in recognizing multiple emotions
Demonstrated effectiveness of retinal-inspired data augmentation
Abstract
Speech emotion recognition (SER) studies the formation and change of a speaker's emotional state from the speech signal, with the aim of making human-computer interaction more intelligent. SER is a challenging task that suffers from limited training data and low prediction accuracy. Here we propose a data augmentation algorithm based on the imaging principle of the retina and convex lens, which produces spectrograms of different sizes and increases the amount of training data by changing the distance between the spectrogram and the convex lens. Meanwhile, using deep learning to extract high-level features, we propose Deep Retinal Convolution Neural Networks (DRCNNs) for SER and achieve an average accuracy of over 99%. The experimental results indicate that DRCNNs outperform previous studies in terms of both the number of emotions and…
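The augmentation idea in the abstract (moving a spectrogram closer to or farther from a convex lens to obtain images of different sizes) can be illustrated with a small sketch. This is not the authors' implementation: it assumes the standard thin-lens relation 1/f = 1/u + 1/v, which gives a magnification of f / (u - f) for an object at distance u > f, and it assumes nearest-neighbor resampling to produce the resized image. The function names `lens_magnification`, `resize_nearest`, and `augment_spectrogram`, as well as the focal length and distances used, are hypothetical and chosen only for illustration.

```python
import numpy as np


def lens_magnification(focal_length: float, object_distance: float) -> float:
    """Thin-lens magnification v/u = f / (u - f) for a real image (u > f).

    Assumption: the ideal thin-lens equation 1/f = 1/u + 1/v.
    """
    if object_distance <= focal_length:
        raise ValueError("object_distance must exceed focal_length")
    return focal_length / (object_distance - focal_length)


def resize_nearest(image: np.ndarray, scale: float) -> np.ndarray:
    """Rescale a 2-D array by `scale` using nearest-neighbor sampling."""
    rows, cols = image.shape
    new_rows = max(1, int(round(rows * scale)))
    new_cols = max(1, int(round(cols * scale)))
    # Map each output pixel back to a source pixel, clamped to valid indices.
    row_idx = np.minimum((np.arange(new_rows) / scale).astype(int), rows - 1)
    col_idx = np.minimum((np.arange(new_cols) / scale).astype(int), cols - 1)
    return image[np.ix_(row_idx, col_idx)]


def augment_spectrogram(spectrogram, focal_length, object_distances):
    """Return one rescaled copy of the spectrogram per object distance."""
    return [
        resize_nearest(spectrogram, lens_magnification(focal_length, u))
        for u in object_distances
    ]


if __name__ == "__main__":
    # Hypothetical stand-in for a real spectrogram.
    spec = np.random.default_rng(0).random((64, 64))
    for copy in augment_spectrogram(spec, 1.0, [1.5, 2.0, 3.0]):
        print(copy.shape)
```

In this sketch, distances closer to the focal length yield larger copies and greater distances yield smaller ones, so a single spectrogram produces several training images. A real pipeline would likely use a smoother interpolation method and rescale the copies to a fixed network input size; those details are not given in the summary above.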
Taxonomy
Topics: Emotion and Mood Recognition · Gaze Tracking and Assistive Technology · EEG and Brain-Computer Interfaces
