Color-based Emotion Representation for Speech Emotion Recognition
Ryotaro Nagase, Ryoichi Takashima, Yoichi Yamashita

TL;DR
This paper introduces a novel approach to speech emotion recognition by using color attributes as continuous, interpretable representations of emotions, leveraging crowdsourced annotations and machine learning models.
Contribution
It proposes using color attributes to represent emotions in speech, and develops regression models and multitask learning techniques to improve SER performance.
Findings
Color attributes correlate with speech emotions.
Multitask learning of color attribute regression and emotion classification improves performance on both tasks.
Color attribute regression models for SER can be built with both machine learning and deep learning.
Abstract
Speech emotion recognition (SER) has traditionally relied on categorical or dimensional labels. However, these labeling schemes are limited in both the diversity of emotions they can represent and their interpretability. To overcome this limitation, we focus on color attributes, such as hue, saturation, and value, to represent emotions as continuous and interpretable scores. We annotated an emotional speech corpus with color attributes via crowdsourcing and analyzed them. Moreover, we built regression models for color attributes in SER using machine learning and deep learning, and explored the multitask learning of color attribute regression and emotion classification. As a result, we demonstrated the relationship between color attributes and emotions in speech, and successfully developed color attribute regression models for SER. We also showed that multitask learning improved the performance of each task.
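The multitask setup described in the abstract, color attribute regression trained jointly with emotion classification, can be illustrated with a combined loss. The following is only a minimal sketch and not the authors' implementation: the function names, the use of mean squared error and cross-entropy, the weighting parameter `alpha`, and the scaling of hue, saturation, and value to [0, 1] are all assumptions made for illustration. In particular, hue is treated here as a plain scalar, which ignores its circular nature.

```python
import math


def mse(pred, target):
    """Mean squared error over a batch of [hue, saturation, value] rows."""
    total = 0.0
    count = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            total += (p - t) ** 2
            count += 1
    return total / count


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy; labels are integer class indices."""
    total = 0.0
    for row, label in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[label]
    return total / len(labels)


def multitask_loss(hsv_pred, hsv_true, emo_logits, emo_labels, alpha=0.5):
    """Weighted sum of the regression and classification losses.

    alpha is a hypothetical weighting hyperparameter (assumption):
    alpha=1.0 trains only color regression, alpha=0.0 only classification.
    """
    reg = mse(hsv_pred, hsv_true)
    cls = cross_entropy(emo_logits, emo_labels)
    return alpha * reg + (1.0 - alpha) * cls, reg, cls


# Usage: one utterance, perfect color prediction, uninformative emotion logits
# over four hypothetical emotion classes.
total, reg, cls = multitask_loss(
    hsv_pred=[[0.1, 0.5, 0.9]],
    hsv_true=[[0.1, 0.5, 0.9]],
    emo_logits=[[0.0, 0.0, 0.0, 0.0]],
    emo_labels=[2],
)
```

In a real system both heads would typically share a speech encoder, so that gradients from each loss term shape the same representation; this is one plausible reason joint training could help both tasks.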
Taxonomy
Topics: Emotion and Mood Recognition · Color perception and design · Sentiment Analysis and Opinion Mining
