RoHyDR: Robust Hybrid Diffusion Recovery for Incomplete Multimodal Emotion Recognition
Yuehan Jin, Xiaoqing Liu, Yiyuan Yang, Zhiwen Yu, Tong Zhang, Kaixiang Yang

TL;DR
RoHyDR combines a diffusion-based generator with adversarial learning to recover missing modalities in multimodal emotion recognition, improving recognition performance when the input data are incomplete.
Contribution
The paper presents a hybrid approach that pairs diffusion-based generation with adversarial learning to recover missing modalities at the unimodal, multimodal, feature, and semantic levels for emotion recognition.
Findings
Outperforms state-of-the-art IMER methods on benchmark datasets.
Effectively recovers missing information at multiple representation levels.
Maintains high recognition accuracy despite various missing-modality scenarios.
Abstract
Multimodal emotion recognition analyzes emotions by combining data from multiple sources. However, real-world noise or sensor failures often cause missing or corrupted data, creating the Incomplete Multimodal Emotion Recognition (IMER) challenge. In this paper, we propose Robust Hybrid Diffusion Recovery (RoHyDR), a novel framework that performs missing-modality recovery at unimodal, multimodal, feature, and semantic levels. For unimodal representation recovery of missing modalities, RoHyDR exploits a diffusion-based generator to generate distribution-consistent and semantically aligned representations from Gaussian noise, using available modalities as conditioning. For multimodal fusion recovery, we introduce adversarial learning to produce a realistic fused multimodal representation and recover missing semantic content. We further propose a multi-stage optimization strategy that…
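The unimodal recovery step described in the abstract, generating a missing-modality representation from Gaussian noise while conditioning on the available modalities, can be illustrated with a generic conditional DDPM-style sampling loop. The sketch below is not the authors' implementation: the function names, the linear noise schedule, the step count, and the `toy_denoiser` (an untrained linear map standing in for a trained noise-prediction network) are all assumptions made for illustration.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_denoiser(x_t, t_frac, cond, weights):
    """Hypothetical stand-in for a trained noise predictor.

    It concatenates the noisy sample, the conditioning features from the
    available modalities, and the normalized timestep, then applies a
    linear map. A real model would be a trained neural network.
    """
    t_col = np.full((x_t.shape[0], 1), t_frac, dtype=x_t.dtype)
    inputs = np.concatenate([x_t, cond, t_col], axis=1)
    return inputs @ weights


def recover_missing(cond, dim, weights, rng, num_steps=50):
    """Ancestral sampling: start from Gaussian noise, denoise step by step."""
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal((cond.shape[0], dim))
    for t in reversed(range(num_steps)):
        eps_hat = toy_denoiser(x, t / num_steps, cond, weights)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No noise is added on the final step.
        noise = rng.standard_normal(x.shape) if t > 0 else np.zeros_like(x)
        x = mean + np.sqrt(betas[t]) * noise
    return x


# Usage: 4 samples, 16-dim features from available modalities, 8-dim missing modality.
rng = np.random.default_rng(0)
cond = rng.standard_normal((4, 16))
weights = 0.01 * rng.standard_normal((8 + 16 + 1, 8))  # untrained, illustrative only
recovered = recover_missing(cond, dim=8, weights=weights, rng=rng)
print(recovered.shape)  # (4, 8)
```

Because the denoiser here is untrained, the output carries no semantic content; the sketch only shows how conditioning features enter every denoising step, which is the general mechanism the abstract refers to.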
