End-to-End Speech Emotion Recognition: Challenges of Real-Life Emergency Call Centers Data Recordings
Théo Deschamps-Berger (LISN, CNRS), Lori Lamel (LISN, CNRS), Laurence Devillers (LISN, CNRS, SU)

TL;DR
This study evaluates an end-to-end deep learning model for speech emotion recognition in emergency call centers, comparing performance on a standard dataset and real-life recordings, highlighting challenges of complex emotions in real scenarios.
Contribution
It applies and tests a neural network architecture for speech emotion recognition on both a standard and a real-life emergency call dataset, revealing real-world challenges.
Findings
63% UA on IEMOCAP with 4 emotions
45.6% UA on CEMO with 4 emotions
76.9% UA on CEMO with 2 emotions
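The figures above are reported as UA. Assuming UA denotes unweighted accuracy in its conventional speech emotion recognition sense (the mean of per-class recalls, so that rare emotions count as much as frequent ones), a minimal sketch of the metric could look like the following. The function names and the toy labels are hypothetical illustrations, not taken from the paper.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls: each class counts equally, whatever its size."""
    total = defaultdict(int)    # number of reference samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def weighted_accuracy(y_true, y_pred):
    """Plain accuracy: fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical, imbalanced toy labels (not data from IEMOCAP or CEMO).
y_true = ["fear"] * 8 + ["anger"] * 2
y_pred = ["fear"] * 8 + ["fear", "anger"]

ua = unweighted_accuracy(y_true, y_pred)
wa = weighted_accuracy(y_true, y_pred)
print(f"UA = {ua:.2f}, WA = {wa:.2f}")
```

On imbalanced data the two measures can diverge: a classifier that favors the majority emotion scores well on plain accuracy but is penalized by UA, which is one reason UA is a common choice for corpora with uneven emotion distributions.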
Abstract
Recognizing a speaker's emotion from their speech can be a key element in emergency call centers. End-to-end deep learning systems for speech emotion recognition now achieve equivalent or even better results than conventional machine learning approaches. In this paper, in order to validate the performance of our neural network architecture for emotion recognition from speech, we first trained and tested it on the widely used corpus accessible by the community, IEMOCAP. We then used the same architecture on the real-life corpus, CEMO, composed of 440 dialogs (2h16m) from 485 speakers. The most frequent emotions expressed by callers in these real-life emergency dialogues are fear, anger and positive emotions such as relief. In the IEMOCAP general topic conversations, the most frequent emotions are sadness, anger and happiness. Using the same end-to-end deep learning architecture, an…
Taxonomy
Topics: Emotion and Mood Recognition · Speech Recognition and Synthesis · Sentiment Analysis and Opinion Mining
