Disentangled Dual-Branch Graph Learning for Conversational Emotion Recognition

Chengling Guo, Yuntao Shou, Tao Meng, Wei Ai, Yun Tan, and Keqin Li

arXiv:2604.14204 · cs.SD · April 17, 2026

TL;DR

This paper introduces a multimodal framework for emotion recognition in conversations. It separates modality-invariant from modality-specific features and models high-order speaker interactions with graph and hypergraph neural networks, and it reports improved results on the IEMOCAP and MELD benchmarks.

Contribution

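The paper proposes a dual-branch graph learning approach built on feature disentanglement: a Fourier graph neural network over modality-invariant features and a speaker-aware hypergraph over modality-specific features. The design targets three challenges in multimodal conversational emotion recognition: redundant cross-modal information, imperfect semantic alignment, and insufficient modeling of high-order speaker interactions.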

Findings

01. Achieves superior performance on the IEMOCAP and MELD datasets.

02. Effectively separates modality-invariant and modality-specific features.

03. Models high-order speaker interactions with hypergraph neural networks.
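
The hypergraph idea in the last finding can be made concrete with a small sketch. This page does not say how the paper builds its hyperedges, so the construction below is an assumption: one hyperedge per speaker plus one per sliding context window. The function names and the `window` parameter are hypothetical. The propagation step uses the common hypergraph convolution form D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2} X Θ with unit hyperedge weights, which may differ from the authors' layer.

```python
import numpy as np


def speaker_hypergraph(speakers, window=3):
    """Build an incidence matrix H of shape (n_utterances, n_hyperedges).

    Assumed construction (not taken from the paper): one hyperedge per
    speaker joining all of that speaker's utterances, plus one hyperedge
    per sliding window of `window` consecutive utterances.
    """
    n = len(speakers)
    edges = []
    for s in sorted(set(speakers)):
        edges.append([i for i, sp in enumerate(speakers) if sp == s])
    for start in range(0, n - window + 1):
        edges.append(list(range(start, start + window)))
    H = np.zeros((n, len(edges)))
    for e, nodes in enumerate(edges):
        H[nodes, e] = 1.0
    return H


def hypergraph_conv(X, H, theta):
    """One hypergraph convolution step with unit hyperedge weights.

    Assumes every utterance belongs to at least one hyperedge, which
    holds here because each utterance has a speaker.
    """
    vertex_deg = H.sum(axis=1)            # hyperedges per utterance
    edge_deg = H.sum(axis=0)              # utterances per hyperedge
    dv_inv_sqrt = 1.0 / np.sqrt(vertex_deg)
    left = (dv_inv_sqrt[:, None] * H) / edge_deg[None, :]
    right = H.T * dv_inv_sqrt[None, :]
    propagation = left @ right            # (n, n), symmetric
    return np.maximum(propagation @ X @ theta, 0.0)  # ReLU


# Toy dialogue: two speakers alternating over five utterances.
speakers = ["A", "B", "A", "B", "A"]
H = speaker_hypergraph(speakers, window=3)
X = np.random.default_rng(0).normal(size=(5, 4))  # stand-in features
theta = np.eye(4)                                  # stand-in weights
out = hypergraph_conv(X, H, theta)
print(H.shape, out.shape)
```

Because a hyperedge can join any number of utterances, a single propagation step lets every utterance by the same speaker exchange information at once, which a pairwise graph would need several edges or hops to express.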

Abstract

Multimodal emotion recognition in conversations aims to infer utterance-level emotions by jointly modeling textual, acoustic, and visual cues within context. Despite recent progress, key challenges remain, including redundant cross-modal information, imperfect semantic alignment, and insufficient modeling of high-order speaker interactions. To address these issues, we propose a framework that combines dual-space feature disentanglement with dual-branch graph learning. A shared encoder and modality-specific encoders are used to separate modality-invariant and modality-specific representations. The invariant features are modeled by a Fourier graph neural network to capture global consistency and complementary patterns, with a frequency-domain contrastive objective to enhance discriminability. In parallel, a speaker-aware hypergraph is constructed over modality-specific features to model…
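
As a rough illustration of the invariant branch the abstract describes, the sketch below passes each modality through one shared encoder and one modality-specific encoder, then filters the fused invariant features in the frequency domain along the utterance axis. Everything here is an assumption made for illustration: the random single-layer encoders, the names `disentangle` and `fourier_filter`, the common input width, the mean fusion, and the simple per-frequency kernel, which stands in for the paper's Fourier graph neural network rather than reproducing it. The contrastive objective and the hypergraph branch are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)


def linear(d_in, d_out):
    """Random (weights, bias) pair standing in for a trained layer."""
    return rng.normal(scale=0.1, size=(d_in, d_out)), np.zeros(d_out)


def encode(x, params):
    """Single tanh layer; a placeholder for a real encoder."""
    W, b = params
    return np.tanh(x @ W + b)


def disentangle(features, shared, specific):
    """Split each modality into invariant and modality-specific parts.

    features: dict mapping modality name to an (n_utterances, d) array,
    assumed to be projected to a common width d beforehand.
    """
    invariant = {m: encode(x, shared) for m, x in features.items()}
    private = {m: encode(x, specific[m]) for m, x in features.items()}
    return invariant, private


def fourier_filter(x, kernel):
    """Mix information across utterances in the frequency domain.

    FFT along the utterance axis, multiply each frequency bin by
    `kernel`, then transform back to the utterance domain.
    """
    spectrum = np.fft.rfft(x, axis=0)
    return np.fft.irfft(spectrum * kernel, n=x.shape[0], axis=0)


# Toy conversation: 6 utterances, 3 modalities, input width 8, hidden 4.
n, d, h = 6, 8, 4
features = {m: rng.normal(size=(n, d)) for m in ("text", "audio", "visual")}
shared = linear(d, h)
specific = {m: linear(d, h) for m in features}

invariant, private = disentangle(features, shared, specific)
fused = np.mean([invariant[m] for m in invariant], axis=0)  # (n, h)

# An all-ones kernel leaves the signal unchanged; keeping only the
# zero-frequency bin replaces each feature by its conversation-wide mean.
identity = np.ones((n // 2 + 1, 1))
dc_only = np.zeros((n // 2 + 1, 1))
dc_only[0] = 1.0
unchanged = fourier_filter(fused, identity)
smoothed = fourier_filter(fused, dc_only)
```

The two kernels bracket what a learned filter could do: low-frequency bins carry context shared across the whole conversation, while higher bins carry utterance-to-utterance changes.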
