RAMer: Reconstruction-based Adversarial Model for Multi-party Multi-modal Multi-label Emotion Recognition
Xudong Yang, Yizhang Zhu, Hanfeng Liu, Zeyi Wen, Nan Tang, Yuyu Luo

TL;DR
RAMer is a novel model that enhances multi-modal emotion recognition in multi-party settings by reconstructing features, leveraging contrastive learning, and using auxiliary tasks to handle incomplete data and improve inter-modal and label correlations.
Contribution
It introduces a reconstruction-based adversarial framework with contrastive learning and auxiliary tasks to address modality incompleteness and interdependency in multi-party multi-modal emotion recognition.
Findings
Achieves state-of-the-art results on MEmoR, CMU-MOSEI, and M3ED benchmarks.
Effectively handles incomplete modalities in multi-party scenarios.
Improves emotion recognition accuracy through novel feature reconstruction and correlation strategies.
Abstract
Conventional multi-modal multi-label emotion recognition (MMER) assumes complete access to visual, textual, and acoustic modalities. However, real-world multi-party settings often violate this assumption, as non-speakers frequently lack acoustic and textual inputs, leading to a significant degradation in model performance. Existing approaches also tend to unify heterogeneous modalities into a single representation, overlooking each modality's unique characteristics. To address these challenges, we propose RAMer (Reconstruction-based Adversarial Model for Emotion Recognition), which refines multi-modal representations not only by exploring modality commonality and specificity but, crucially, by leveraging reconstructed features, enhanced by contrastive learning, to overcome data incompleteness and enrich feature quality. RAMer also introduces a personality auxiliary task to complement…
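To make the two ideas named in the abstract more concrete (tolerating missing modalities and pulling reconstructed features toward their originals with a contrastive objective), here is a minimal NumPy sketch. It is not the RAMer implementation: the InfoNCE-style loss, the masked mean fusion, and the names `info_nce` and `masked_mean_fusion` are assumptions chosen for illustration only, since the abstract does not specify the actual loss or fusion scheme.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def info_nce(reconstructed, original, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in for the paper's contrastive term).

    Row i of `reconstructed` is treated as the positive for row i of
    `original`; every other row in the batch acts as a negative.
    """
    r = l2_normalize(reconstructed)
    o = l2_normalize(original)
    logits = r @ o.T / temperature                        # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def masked_mean_fusion(features, available):
    """Average modality features per sample, skipping missing modalities.

    features:  (batch, n_modalities, dim)
    available: (batch, n_modalities), 1 if the modality is present, else 0
    """
    mask = available[:, :, None].astype(features.dtype)
    counts = np.maximum(mask.sum(axis=1), 1.0)            # avoid divide-by-zero
    return (features * mask).sum(axis=1) / counts

# Hypothetical usage: a non-speaker (row 0) has only the visual modality.
rng = np.random.default_rng(0)
feats = rng.normal(size=(2, 3, 16))        # visual, textual, acoustic
avail = np.array([[1, 0, 0], [1, 1, 1]])
fused = masked_mean_fusion(feats, avail)   # shape (2, 16)
```

In this sketch the loss is small when each reconstructed vector is closer to its own original than to the other samples in the batch, which is the general behaviour a contrastive term is meant to encourage.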
Taxonomy
Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining
