Latent Distribution Decoupling: A Probabilistic Framework for Uncertainty-Aware Multimodal Emotion Recognition
Jingwang Huang, Jiang Zhong, Qin Lei, Jinpeng Gao, Yuming Yang, Sirui Wang, Peiguang Li, Kaiwen Wei

TL;DR
This paper introduces LDDU, a probabilistic framework for multimodal emotion recognition that models aleatoric uncertainty in the latent emotional space, leading to improved fusion and state-of-the-art results.
Contribution
The paper proposes a novel latent distribution decomposition framework that explicitly models uncertainty in multimodal emotion recognition.
Findings
Achieves state-of-the-art performance on CMU-MOSEI and M3ED datasets.
Effectively models aleatoric uncertainty to improve modality fusion.
Demonstrates the importance of uncertainty modeling in MMER.
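For intuition only, here is a minimal sketch of one common way uncertainty can shape modality fusion: each modality reports a Gaussian estimate (mean and variance) of an emotion score, and the estimates are combined by precision weighting, so that noisier modalities contribute less. This is a generic textbook illustration, not the LDDU mechanism, whose details are not given on this page; the function name, modality names, and numbers are all hypothetical.

```python
def fuse_gaussians(means, variances):
    """Precision-weighted fusion of independent 1-D Gaussian estimates.

    Each modality i supplies a mean m_i and a variance v_i (its uncertainty).
    The fused estimate weights each mean by its precision 1 / v_i.
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    fused_mean = sum(p * m for p, m in zip(precisions, means)) / total_precision
    fused_var = 1.0 / total_precision
    return fused_mean, fused_var


# Hypothetical per-modality estimates of a single emotion score.
means = [0.8, 0.2, 0.6]        # text, audio, visual (illustrative values)
variances = [0.1, 1.0, 0.4]    # audio is assumed to be the noisiest

fused_mean, fused_var = fuse_gaussians(means, variances)
print(fused_mean, fused_var)
```

With these illustrative inputs the fused mean sits closest to the low-variance text estimate, and the fused variance is smaller than that of any single modality. A plain average would instead give the noisy audio estimate the same weight as the others.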
Abstract
Multimodal multi-label emotion recognition (MMER) aims to identify the concurrent presence of multiple emotions in multimodal data. Existing studies primarily focus on improving fusion strategies and modeling modality-to-label dependencies. However, they often overlook the impact of aleatoric uncertainty, which is the inherent noise in the multimodal data and hinders the effectiveness of modality fusion by introducing ambiguity into feature representations. To address this issue and effectively model aleatoric uncertainty, this paper proposes the Latent emotional Distribution Decomposition with Uncertainty perception (LDDU) framework from a novel perspective of latent emotional space probabilistic modeling. Specifically, we introduce a contrastive disentangled distribution mechanism within the emotion space to model the multimodal data, allowing for the extraction of semantic…
Taxonomy
Topics: Emotion and Mood Recognition · Speech Recognition and Synthesis · Anomaly Detection Techniques and Applications
