DER-GCN: Dialogue and Event Relation-Aware Graph Convolutional Neural Network for Multimodal Dialogue Emotion Recognition
Wei Ai, Yuntao Shou, Tao Meng, Nan Yin, and Keqin Li

TL;DR
This paper introduces DER-GCN, a novel graph neural network that models dialogue and event relations for multimodal emotion recognition, improving accuracy by capturing complex dependencies in dialogue data.
Contribution
The paper proposes a new graph-based model incorporating event relations and a self-supervised autoencoder to enhance multimodal emotion recognition.
Findings
Significant improvement in emotion recognition accuracy on IEMOCAP and MELD datasets.
Effective modeling of dialogue and event relations enhances emotion detection.
The approach outperforms existing methods in both accuracy and F1 score.
Abstract
With the continuous development of deep learning (DL), multimodal dialogue emotion recognition (MDER), an essential branch of DL, has recently received extensive research attention. MDER aims to identify the emotional information contained in different modalities, e.g., text, video, and audio, in different dialogue scenes. However, existing research has focused on modeling contextual semantic information and dialogue relations between speakers while ignoring the impact of event relations on emotion. To address this gap, we propose a novel Dialogue and Event Relation-Aware Graph Convolutional Neural Network for Multimodal Emotion Recognition (DER-GCN). It models dialogue relations between speakers and captures latent event relation information. Specifically, we construct a weighted multi-relationship graph to simultaneously capture the…
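To make the idea of a multi-relationship graph over utterances more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the relation scheme (`same_speaker` / `cross_speaker` edges inside a fixed context `window`), the function names, and the unweighted mean aggregation are all assumptions chosen for illustration, and event relations are omitted because the excerpt above does not describe how they are built.

```python
import numpy as np

def build_relation_adjacency(speakers, window=2):
    """Return {relation name: (n, n) adjacency matrix} for a dialogue.

    Hypothetical scheme: utterance j is a neighbour of utterance i when
    0 < |i - j| <= window; the edge type depends on whether the two
    utterances share a speaker.
    """
    n = len(speakers)
    adj = {"same_speaker": np.zeros((n, n)), "cross_speaker": np.zeros((n, n))}
    for i in range(n):
        for j in range(max(0, i - window), min(n, i + window + 1)):
            if i == j:
                continue
            rel = "same_speaker" if speakers[i] == speakers[j] else "cross_speaker"
            adj[rel][i, j] = 1.0
    return adj

def relational_gcn_layer(x, adj, weights, self_weight):
    """One R-GCN-style layer: per-relation mean aggregation plus a self term."""
    out = x @ self_weight
    for rel, a in adj.items():
        deg = a.sum(axis=1, keepdims=True)
        # Row-normalise; rows with no neighbours of this relation stay zero.
        norm = np.divide(a, deg, out=np.zeros_like(a), where=deg > 0)
        out = out + norm @ x @ weights[rel]
    return np.maximum(out, 0.0)  # ReLU

# Toy dialogue: five utterances, two alternating speakers, random features.
rng = np.random.default_rng(0)
speakers = ["A", "B", "A", "B", "A"]
x = rng.normal(size=(5, 8))  # stand-in for fused multimodal features
adj = build_relation_adjacency(speakers, window=2)
weights = {rel: rng.normal(size=(8, 4)) for rel in adj}
h = relational_gcn_layer(x, adj, weights, rng.normal(size=(8, 4)))
print(h.shape)  # (5, 4)
```

Each relation type gets its own weight matrix, so same-speaker and cross-speaker context can influence an utterance's representation differently. A weighted graph, as mentioned in the abstract, would replace the 0/1 entries with learned or similarity-based edge weights.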
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Emotion and Mood Recognition · Speech and dialogue systems
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Position-Wise Feed-Forward Layer · Contrastive Learning · Absolute Position Encodings · Label Smoothing · Adam · Byte Pair Encoding
