TL;DR
This paper introduces DGDA, a framework for cross-scenario multimodal emotion recognition that handles domain shift and noisy labels by combining a dual-branch graph encoder with adversarial training.
Contribution
DGDA is the first MERC framework to jointly address domain adaptation and label noise with a dual-branch graph neural network and adversarial learning.
Findings
DGDA outperforms strong baselines on IEMOCAP and MELD datasets.
The framework adapts across conversation scenarios better than the compared baselines.
Theoretical analysis confirms tighter generalization bounds for DGDA.
Abstract
Multimodal Emotion Recognition in Conversations (MERC) aims to predict speakers' emotional states in multi-turn dialogues through text, audio, and visual cues. In real-world settings, conversation scenarios differ significantly in speakers, topics, styles, and noise levels. Existing MERC methods generally neglect these cross-scenario variations, limiting their ability to transfer models trained on a source domain to unseen target domains. To address this issue, we propose a Dual-branch Graph Domain Adaptation framework (DGDA) for multimodal emotion recognition under cross-scenario conditions. We first construct an emotion interaction graph to characterize complex emotional dependencies among utterances. A dual-branch encoder, consisting of a hypergraph neural network (HGNN) and a path neural network (PathNN), is then designed to explicitly model multivariate relationships and implicitly…
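To make the hypergraph branch mentioned in the abstract more concrete, the following is a minimal, dependency-free sketch of one hypergraph propagation step over utterance features. It is an illustration under stated assumptions, not the authors' implementation: the hyperedge construction (one hyperedge per speaker plus one per sliding context window), the function names `build_incidence` and `hypergraph_propagate`, and the `window` parameter are all hypothetical, and the learnable weights and nonlinearity of a full HGNN layer are omitted. The propagation uses the commonly used normalized form $D_v^{-1/2} H D_e^{-1} H^\top D_v^{-1/2} X$ with unit hyperedge weights, which may differ from what DGDA uses.

```python
from math import sqrt


def build_incidence(speakers, window=2):
    """Build an incidence matrix H (n_utterances x n_hyperedges) as nested lists.

    Assumption (not from the paper): one hyperedge per distinct speaker,
    plus one hyperedge per sliding window of `window` consecutive utterances.
    """
    n = len(speakers)
    edges = []
    for s in sorted(set(speakers)):
        edges.append([i for i, sp in enumerate(speakers) if sp == s])
    for start in range(0, n - window + 1):
        edges.append(list(range(start, start + window)))

    H = [[0.0] * len(edges) for _ in range(n)]
    for e, members in enumerate(edges):
        for i in members:
            H[i][e] = 1.0
    return H


def hypergraph_propagate(H, X):
    """One parameter-free step: Dv^-1/2 H De^-1 H^T Dv^-1/2 X.

    H: incidence matrix (n x m); X: node features (n x d).
    Every node must belong to at least one hyperedge, and every
    hyperedge must contain at least one node.
    """
    n, m, d = len(H), len(H[0]), len(X[0])
    dv = [sum(H[i]) for i in range(n)]                        # node degrees
    de = [sum(H[i][e] for i in range(n)) for e in range(m)]   # hyperedge degrees

    # Scale node features by Dv^-1/2.
    Xs = [[x / sqrt(dv[i]) for x in X[i]] for i in range(n)]
    # Gather nodes into hyperedges and average by hyperedge degree.
    E = [[sum(H[i][e] * Xs[i][k] for i in range(n)) / de[e] for k in range(d)]
         for e in range(m)]
    # Scatter hyperedge features back to nodes and scale by Dv^-1/2 again.
    return [[sum(H[i][e] * E[e][k] for e in range(m)) / sqrt(dv[i])
             for k in range(d)]
            for i in range(n)]


if __name__ == "__main__":
    speakers = ["A", "B", "A", "B"]                # toy four-turn dialogue
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
    H = build_incidence(speakers, window=2)
    for row in hypergraph_propagate(H, X):
        print(row)
```

Because a hyperedge can connect any number of utterances at once, each step mixes information across a whole speaker group or context window rather than across pairs only, which is the sense in which a hypergraph models multivariate relationships.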
