Adaptive Speech Emotion Representation Learning Based On Dynamic Graph
Yingxue Gao, Huan Zhao, Zixing Zhang

TL;DR
This paper introduces an adaptive graph learning method for speech emotion recognition that dynamically evolves graph structures to better capture local and global context in sequential speech data, improving recognition accuracy.
Contribution
It proposes a novel dynamic graph construction and node updating approach, along with a learnable graph convolutional layer, enhancing speech emotion recognition performance.
Findings
Outperforms existing models on IEMOCAP and RAVDESS datasets
Effectively captures local and global context in speech sequences
Demonstrates improved emotion recognition accuracy
Abstract
Graph representation learning has become a hot research topic due to its powerful nonlinear fitting capability in extracting representative node embeddings. However, for sequential data such as speech signals, most traditional methods focus only on a static graph created within a sequence and largely overlook the intrinsic evolving patterns of the data. This may reduce the efficiency of graph representation learning for sequential data. For this reason, we propose an adaptive graph representation learning method based on dynamically evolved graphs, which are consecutively constructed on a series of subsequences segmented by a sliding window. This makes it easier to capture both local and global context information within a long sequence. Moreover, we introduce a weighted approach to updating the node representation, rather than the conventional averaging approach, where the weights are…
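The pipeline described in the abstract (sliding-window segmentation, one graph per subsequence, weighted rather than averaged node updates) can be illustrated with a minimal sketch. It is not the authors' implementation. The names `sliding_windows`, `chain_adjacency`, and `update_nodes` are hypothetical, and connecting each frame to its temporal neighbors is an assumed graph structure chosen only for illustration. Because the abstract is cut off before it says how the weights are obtained, the sketch takes them as a caller-supplied matrix instead of computing them.

```python
from typing import List, Optional, Sequence

Vector = List[float]
Matrix = List[List[float]]


def sliding_windows(frames: Sequence[Vector], size: int, stride: int) -> List[List[Vector]]:
    """Segment a frame sequence into (possibly overlapping) subsequences."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if not frames:
        return []
    # A sequence shorter than `size` yields a single, shorter window.
    last_start = max(len(frames) - size, 0)
    return [list(frames[s:s + size]) for s in range(0, last_start + 1, stride)]


def chain_adjacency(n: int, hops: int = 1) -> Matrix:
    """Assumed structure: link each node to itself and to nodes within `hops` steps."""
    return [[1.0 if abs(i - j) <= hops else 0.0 for j in range(n)] for i in range(n)]


def update_nodes(nodes: Sequence[Vector], adjacency: Matrix,
                 weights: Optional[Matrix] = None) -> List[Vector]:
    """Aggregate neighbor features.

    With weights=None this is the conventional neighborhood average.
    With a weight matrix, each neighbor's contribution is scaled by
    weights[i][j] and the result is normalized by the sum of the scaled
    coefficients. How the weights are produced is left to the caller.
    """
    updated = []
    for i, row in enumerate(adjacency):
        coeffs = [a * (weights[i][j] if weights is not None else 1.0)
                  for j, a in enumerate(row)]
        total = sum(coeffs)
        if total == 0:
            # Isolated node: keep its features unchanged.
            updated.append(list(nodes[i]))
            continue
        dim = len(nodes[i])
        updated.append([sum(c * nodes[j][d] for j, c in enumerate(coeffs)) / total
                        for d in range(dim)])
    return updated


# Toy example: five one-dimensional "frames", windows of three frames, stride one.
frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]
windows = sliding_windows(frames, size=3, stride=1)
graphs = [chain_adjacency(len(w)) for w in windows]  # one graph per subsequence
averaged = update_nodes(windows[0], graphs[0])
weighted = update_nodes(windows[0], graphs[0],
                        weights=[[1.0, 3.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]])
```

In this toy run, the first node's averaged value is the mean of itself and its one neighbor, while the weighted version pulls it toward the neighbor with the larger weight. Building a separate graph for each window is what lets the structure change along the sequence; a learned model would replace the fixed adjacency and the hand-written weights.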
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies
