Bridging Speech Emotion Recognition and Personality: Dataset and Temporal Interaction Condition Network
Yuan Gao, Hao Shi, Yahui Fu, Chenhui Chu, Tatsuya Kawahara

TL;DR
This paper introduces a new dataset with both emotion and personality annotations, and a novel neural network model that leverages personality traits to improve speech emotion recognition accuracy.
Contribution
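The paper provides PA-IEMOCAP, described as the first speech dataset combining emotion and personality annotations, and proposes a temporal interaction condition network (TICN) that integrates personality traits to improve speech emotion recognition (SER) performance.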
Findings
Incorporating ground-truth personality traits improves the concordance correlation coefficient (CCC) for valence recognition from 0.698 to 0.785.
Automatically predicted personality traits raise valence CCC to 0.776, an 11.17% relative improvement over the 0.698 baseline.
These results support the effectiveness of personality-aware speech emotion recognition.
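The percentages in the findings above are relative gains over the baseline valence CCC. A minimal sketch of that arithmetic follows; the helper name `relative_improvement` is hypothetical and not taken from the paper, and only the three CCC values come from the summary.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# CCC values quoted in the findings above.
baseline_ccc = 0.698
predicted_gain = relative_improvement(baseline_ccc, 0.776)     # predicted traits
ground_truth_gain = relative_improvement(baseline_ccc, 0.785)  # ground-truth traits

print(f"predicted traits:    {predicted_gain:.2f}%")     # 11.17%
print(f"ground-truth traits: {ground_truth_gain:.2f}%")  # 12.46%
```

By the same arithmetic, the ground-truth result corresponds to a relative gain of about 12.46%, a figure derived here rather than quoted from the paper.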
Abstract
This study investigates the interaction between personality traits and emotion expression, exploring how personality information can improve speech emotion recognition (SER). We collect personality annotations for the IEMOCAP dataset, making it the first speech dataset that contains both emotion and personality annotations (PA-IEMOCAP) and enabling direct integration of personality traits into SER. Statistical analysis of this dataset identifies significant correlations between personality traits and emotional expressions. To extract fine-grained personality features, we propose a temporal interaction condition network (TICN), in which personality features are integrated with HuBERT-based acoustic features for SER. Experiments show that incorporating ground-truth personality traits significantly enhances valence recognition, improving the concordance correlation coefficient (CCC)…
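For readers unfamiliar with the metric, the concordance correlation coefficient can be sketched as below. This uses the standard definition with population (biased) variances, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The function name `concordance_cc` and the toy inputs are assumptions for illustration, and the paper's own evaluation code may differ in detail.

```python
from statistics import fmean


def concordance_cc(predictions, labels):
    """Concordance correlation coefficient using population variances."""
    if len(predictions) != len(labels) or len(predictions) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")

    mean_p = fmean(predictions)
    mean_l = fmean(labels)

    # Population variance and covariance (divide by n, not n - 1).
    var_p = fmean([(p - mean_p) ** 2 for p in predictions])
    var_l = fmean([(l - mean_l) ** 2 for l in labels])
    cov = fmean([(p - mean_p) * (l - mean_l)
                 for p, l in zip(predictions, labels)])

    denominator = var_p + var_l + (mean_p - mean_l) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2 * cov / denominator


# Toy example: predictions track the labels perfectly but are shifted by +1.
# Pearson correlation would be 1.0; CCC penalizes the offset.
shifted = concordance_cc([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
exact = concordance_cc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
print(shifted, exact)
```

Unlike Pearson correlation, CCC drops when predictions are offset or rescaled relative to the labels, which is why it is a common choice for dimensional emotion targets such as valence.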
Taxonomy
Topics: Emotion and Mood Recognition · Speech Recognition and Synthesis
