Classification of Emotions and Evaluation of Customer Satisfaction from Speech in Real World Acoustic Environments
Luis Felipe Parra-Gallego, Juan Rafael Orozco-Arroyave

TL;DR
This study compares various speech features for emotion recognition and customer satisfaction evaluation in real-world acoustic environments, highlighting the effectiveness of articulation features in uncontrolled settings.
Contribution
Introduces and evaluates phonation, articulation, and prosody features for robust emotion and satisfaction classification in real-world scenarios, emphasizing practical industrial applications.
Findings
I2010PC feature set performs best on standard emotion corpora
Articulation features outperform others in call-center recordings
Proposed features are suitable for uncontrolled acoustic conditions
Abstract
This paper focuses on finding suitable features to robustly recognize emotions and evaluate customer satisfaction from speech in real acoustic scenarios. The classification of emotions is based on standard, well-known corpora, and the evaluation of customer satisfaction is based on recordings of real opinions given by customers about the service received during phone calls with call-center agents. The feature sets considered in this study include two speaker models, namely x-vectors and i-vectors, and the well-known feature set introduced in the Interspeech 2010 Paralinguistics Challenge (I2010PC). Additionally, we introduce the use of phonation, articulation, and prosody features extracted with the DisVoice framework as alternative feature sets to robustly model emotions and customer satisfaction from speech. The results indicate that the I2010PC feature set is the best approach…
