AHD ConvNet for Speech Emotion Classification
Asfand Ali, Danial Nasir, Mohammad Hassan Jawad

TL;DR
This paper introduces a novel speech emotion recognition model using mel spectrograms and a ConvNet, achieving faster training times on the CREMA-D dataset compared to existing methods.
Contribution
The work proposes a new mel spectrogram learning approach with a ConvNet for speech emotion classification, emphasizing reduced training time.
Findings
Effective emotion recognition from speech using mel spectrograms
Faster training times compared to previous methods
Successful application on CREMA-D dataset
Abstract
Advances in artificial intelligence are used to build intelligent machines that assist people and improve user experience. Emotions are fundamental to people, affecting thinking and everyday activities such as communication, learning and decision-making. Speech emotion recognition is a domain of interest in this regard. In this work, we propose a novel mel spectrogram learning approach in which our model learns emotions from the WAV-format voice recordings in the popular CREMA-D dataset. Our model uses the log mel spectrogram as its feature, with the number of mel bands set to 64. It required less training time than other approaches to speech emotion recognition.
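To make the feature concrete, here is a minimal NumPy sketch of a log mel spectrogram with 64 mel bands. Only the band count (64) comes from the abstract. The 16 kHz sample rate, FFT size, hop length, Hann window, the HTK-style mel formula, and the function names are all assumptions made for illustration, and this is not necessarily the authors' pipeline.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale (an assumed choice; other variants exist)
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_band_edges(sr, n_mels=64):
    # n_mels + 2 points equally spaced on the mel scale, from 0 Hz to Nyquist
    return mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))

def mel_filterbank(sr, n_fft, n_mels=64):
    # Triangular filters: band i rises from edge i to edge i+1, falls to edge i+2
    hz_pts = mel_band_edges(sr, n_mels)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(y, sr, n_fft=1024, hop=256, n_mels=64, eps=1e-10):
    # n_fft, hop and eps are illustrative defaults, not values from the paper
    y = np.asarray(y, dtype=float)
    if len(y) < n_fft:
        y = np.pad(y, (0, n_fft - len(y)))
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[t * hop : t * hop + n_fft] * window for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T     # (n_frames, n_mels)
    return 10.0 * np.log10(mel + eps).T                   # (n_mels, n_frames), in dB

# Usage on a synthetic 1-second, 440 Hz tone (stand-in for a real recording)
sr = 16000  # assumed sample rate
t = np.arange(sr) / sr
features = log_mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr)
print(features.shape)
```

The result is a 2D array with one row per mel band and one column per frame, which is the image-like input a ConvNet can consume. In practice a library routine such as librosa's mel spectrogram function would likely replace the hand-written filterbank.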
Taxonomy
Topics: Neural Networks and Applications · Emotion and Mood Recognition · Anomaly Detection Techniques and Applications
