Graph-based multi-Feature fusion method for speech emotion recognition
Xueyu Liu, Jie Lin, Chao Wang

TL;DR
This paper introduces a graph-based multi-feature fusion method for speech emotion recognition that explicitly models relationships between features, leading to improved cross-corpus performance and better emotion recognition accuracy.
Contribution
It proposes a novel graph-based fusion approach with multi-dimensional edge features to better capture feature interactions in speech emotion recognition.
Findings
Achieved a 17.28% improvement in concordance correlation coefficient (CCC) scores for arousal on German data.
Demonstrated a 13% performance improvement over existing fusion techniques.
Validated effectiveness on SEWA and AVEC 2019 datasets.
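For readers unfamiliar with the metric behind these findings, the concordance correlation coefficient (CCC) measures agreement between predicted and annotated values, penalising both low correlation and shifts in mean or scale. The following is a minimal sketch of the standard formula, not the authors' evaluation code; the function name `concordance_cc` and the sample values are illustrative assumptions.

```python
import numpy as np

def concordance_cc(pred, gold):
    """Standard CCC: 2*cov / (var_pred + var_gold + (mean_pred - mean_gold)^2).

    `concordance_cc` is a hypothetical helper name, not taken from the paper.
    """
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    mean_p, mean_g = pred.mean(), gold.mean()
    # Population (biased) variance and covariance, used consistently.
    var_p, var_g = pred.var(), gold.var()
    cov = ((pred - mean_p) * (gold - mean_g)).mean()
    return 2.0 * cov / (var_p + var_g + (mean_p - mean_g) ** 2)

# Illustrative values only (not data from the paper).
perfect = concordance_cc([1, 2, 3], [1, 2, 3])   # identical sequences
shifted = concordance_cc([1, 2, 3], [2, 3, 4])   # same shape, offset mean
```

Unlike Pearson correlation, which would score the shifted sequence as a perfect match, CCC drops below 1 whenever predictions are offset from the annotations.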
Abstract
Finding a proper way to fuse multiple speech features for cross-corpus speech emotion recognition is crucial, as different speech features can provide complementary cues reflecting human emotional state. While most previous approaches extract only a single speech feature for emotion recognition, existing fusion methods such as concatenation, parallel connection, and splicing ignore heterogeneous patterns in the interactions between features, which limits the performance of existing systems. In this paper, we propose a novel graph-based fusion method to explicitly model the relationships between every pair of speech features. Specifically, we propose a multi-dimensional edge feature learning strategy called Graph-based multi-Feature fusion method for speech emotion recognition. It represents each speech feature as a node and learns multi-dimensional edge features to…
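The general idea of treating each speech feature as a graph node and attaching a vector (rather than a scalar weight) to every edge can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' architecture: the feature names, dimensions `D` and `K`, the tanh projections, and the per-dimension message transforms are all hypothetical, and the weights are random where a real model would learn them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical utterance-level features of different sizes (names are placeholders).
features = {
    "feat_a": rng.normal(size=40),
    "feat_b": rng.normal(size=88),
    "feat_c": rng.normal(size=128),
}

D = 16  # shared node embedding size (assumed)
K = 4   # number of edge feature dimensions (assumed)

# 1) One node per speech feature: project each feature into a common space.
proj = {name: rng.normal(scale=0.1, size=(D, x.shape[0])) for name, x in features.items()}
nodes = np.stack([np.tanh(proj[name] @ x) for name, x in features.items()])  # (N, D)
N = nodes.shape[0]

# 2) A K-dimensional edge feature for every ordered pair of distinct nodes.
W_edge = rng.normal(scale=0.1, size=(K, 2 * D))
edges = np.zeros((N, N, K))
for i in range(N):
    for j in range(N):
        if i != j:
            pair = np.concatenate([nodes[i], nodes[j]])
            edges[i, j] = np.tanh(W_edge @ pair)

# 3) Message passing: each edge dimension scales its own transform of the neighbour.
W_msg = rng.normal(scale=0.1, size=(K, D, D))
updated = nodes.copy()
for i in range(N):
    for j in range(N):
        if i != j:
            for k in range(K):
                updated[i] += edges[i, j, k] * (W_msg[k] @ nodes[j])

# 4) Pool the updated nodes into one fused representation for a downstream regressor.
fused = updated.mean(axis=0)  # (D,)
```

Compared with plain concatenation, a structure like this lets each pair of features interact through several separately weighted channels, which is the kind of pairwise relationship the abstract says concatenation-style fusion ignores.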
Taxonomy
Topics: Advanced Computing and Algorithms
