MIAR: Modality Interaction and Alignment Representation Fusion for Multimodal Emotion
Jichao Zhu, Jun Yu

TL;DR
The paper introduces MIAR, a novel multimodal emotion recognition method that effectively integrates and aligns language, vision, and audio modalities, improving performance and generalization across diverse datasets.
Contribution
MIAR uniquely combines feature interaction and contrastive alignment to better handle modality differences and enhance multimodal emotion recognition accuracy.
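As a rough illustration of what contrastive alignment between two modalities can look like, the sketch below computes a symmetric InfoNCE-style loss over a batch of paired embeddings. This is an assumption for illustration only: the summary does not state which contrastive objective MIAR uses, and the function names, the temperature value, and the synthetic "text" and "audio" embeddings are all hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def info_nce(a, b, temperature=0.1):
    # a, b: (batch, dim) embeddings from two modalities; row i of a pairs with row i of b.
    # Hypothetical stand-in for a contrastive alignment loss, not the paper's exact objective.
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature            # pairwise similarities, shape (batch, batch)
    idx = np.arange(len(a))
    loss_ab = -log_softmax(logits)[idx, idx].mean()    # match each a_i to its b_i
    loss_ba = -log_softmax(logits.T)[idx, idx].mean()  # match each b_i to its a_i
    return 0.5 * (loss_ab + loss_ba)

# Synthetic embeddings (assumed data, for demonstration only).
rng = np.random.default_rng(0)
text = rng.normal(size=(8, 16))
audio_aligned = text + 0.01 * rng.normal(size=(8, 16))  # nearly identical to the text rows
audio_random = rng.normal(size=(8, 16))                 # unrelated to the text rows

loss_aligned = info_nce(text, audio_aligned)
loss_random = info_nce(text, audio_random)
print(loss_aligned < loss_random)
```

Minimizing a loss of this kind pulls paired embeddings from different modalities together and pushes unpaired ones apart, which is one common way to reduce distributional gaps between modalities.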
Findings
MIAR outperforms existing methods on CMU-MOSI and CMU-MOSEI datasets.
The approach improves cross-modal alignment and global feature representation.
Experimental results demonstrate superior performance over state-of-the-art techniques.
Abstract
Multimodal Emotion Recognition (MER) aims to perceive human emotions through three modalities: language, vision, and audio. Previous methods primarily focused on modal fusion without adequately addressing the significant distributional differences among modalities or accounting for their varying contributions to the task. They also generalized poorly across diverse textual model features, which limited performance in multimodal scenarios. We therefore propose a novel approach called Modality Interaction and Alignment Representation (MIAR). This network integrates contextual features across modalities through feature interaction, generating four feature tokens that serve as global representations of how each modality extracts information from the others. MIAR…
Taxonomy
Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Multimodal Machine Learning Applications
