Masked Graph Learning with Recurrent Alignment for Multimodal Emotion Recognition in Conversation
Tao Meng, Fuchen Zhang, Yuntao Shou, Hongen Shao, Wei Ai, Keqin Li

TL;DR
This paper introduces MGLRA, a novel multimodal emotion recognition method that uses recurrent alignment and masked graph convolution to improve feature fusion and recognition accuracy across multiple modalities.
Contribution
The paper proposes a new approach combining recurrent iterative alignment and masked GCN for better multimodal feature fusion in emotion recognition.
Findings
Outperforms state-of-the-art methods on IEMOCAP and MELD datasets.
Effectively aligns multimodal features using recurrent and attention mechanisms.
Reduces intra-modal noise and improves emotion recognition accuracy.
Abstract
Multimodal Emotion Recognition in Conversation (MERC) has received extensive research attention in recent years because it can be applied to public opinion monitoring, intelligent dialogue robots, and other fields. Unlike traditional unimodal emotion recognition, MERC can fuse complementary semantic information from multiple modalities (e.g., text, audio, and vision) to improve emotion recognition. However, previous work fuses multimodal features directly, ignoring inter-modal alignment and intra-modal noise before fusion, which hinders the model's representation learning. In this study, we develop a novel approach called Masked Graph Learning with Recurrent Alignment (MGLRA) to tackle this problem, which uses a recurrent iterative module with memory to align multimodal features, and then uses the masked GCN for multimodal feature…
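To make the "masked GCN" idea concrete, the following is a minimal, dependency-free sketch of one graph convolution layer applied after randomly masking some node features. It is not the authors' implementation: the abstract is truncated before it describes the masking scheme, so the random node-level masking, the `mask_ratio` parameter, the chain-shaped conversation graph, and all function names here are assumptions chosen for illustration. Only the symmetric normalization with self-loops follows the standard GCN formulation.

```python
import math
import random


def normalize_adjacency(adj):
    """Standard GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own feature vector.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def masked_gcn_layer(features, adj, weight, mask_ratio, rng):
    """One GCN layer over node features with a random subset of nodes zeroed.

    ASSUMPTION: masking is random and node-level; the paper may mask
    differently (e.g. edges, feature dimensions, or learned masks).
    """
    n = len(features)
    masked = set(rng.sample(range(n), int(round(mask_ratio * n))))
    x = [[0.0] * len(row) if i in masked else list(row)
         for i, row in enumerate(features)]
    pre_activation = matmul(matmul(normalize_adjacency(adj), x), weight)
    # ReLU non-linearity.
    hidden = [[max(v, 0.0) for v in row] for row in pre_activation]
    return hidden, masked


# Hypothetical toy input: four utterances linked in conversation order.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
features = [[1.0, 0.5, 0.0],
            [0.2, 0.1, 0.9],
            [0.7, 0.3, 0.4],
            [0.0, 0.8, 0.6]]
weight = [[0.2, -0.1],
          [0.4, 0.3],
          [-0.5, 0.1]]

hidden, masked = masked_gcn_layer(features, adj, weight, 0.25, random.Random(0))
```

In this sketch a masked utterance still receives a representation, because the normalized adjacency propagates its neighbors' features to it; that is the general intuition behind masking as a way to reduce reliance on noisy per-node features.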
Taxonomy
Methods: Sigmoid Activation · Linear Layer · Tanh Activation · Softmax · Long Short-Term Memory · Attention Is All You Need · Multi-Head Attention · ALIGN · Graph Convolutional Network
