Calibrating Multimodal Consensus for Emotion Recognition

Guowei Zhong; Junjie Li; Huaiyu Zhu; Ruohong Huan; Yun Pan

arXiv:2510.20256·cs.CV·October 24, 2025

TL;DR

This paper introduces Calibrated Multimodal Consensus (CMC), a model for multimodal emotion recognition that addresses cross-modal semantic inconsistencies and text modality dominance, and outperforms state-of-the-art methods on four datasets.

Contribution

The paper proposes CMC, which generates pseudo unimodal labels for self-supervised unimodal pretraining and then applies a consensus-guided fusion process during multimodal finetuning to improve emotion recognition accuracy.

Findings

01 CMC outperforms state-of-the-art methods on four datasets.

02 CMC remains robust in scenarios with semantic inconsistencies across modalities.

03 The approach mitigates text modality dominance.

Abstract

In recent years, Multimodal Emotion Recognition (MER) has made substantial progress. Nevertheless, most existing approaches neglect the semantic inconsistencies that may arise across modalities, such as conflicting emotional cues between text and visual inputs. Besides, current methods are often dominated by the text modality due to its strong representational capacity, which can compromise recognition accuracy. To address these challenges, we propose a model termed Calibrated Multimodal Consensus (CMC). CMC introduces a Pseudo Label Generation Module (PLGM) to produce pseudo unimodal labels, enabling unimodal pretraining in a self-supervised fashion. It then employs a Parameter-free Fusion Module (PFM) and a Multimodal Consensus Router (MCR) for multimodal finetuning, thereby mitigating text dominance and guiding the fusion process toward a more reliable consensus. Experimental results…
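The abstract does not specify how PFM or MCR work internally, so the following is only a toy sketch of the general idea of parameter-free, consensus-weighted fusion, not the paper's method. It assumes each modality already produces class logits, and it weights each modality by how closely its prediction agrees with the average prediction. The function name `consensus_fuse`, the agreement measure (one minus total variation distance), and the example logits are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consensus_fuse(unimodal_logits):
    """Fuse per-modality logits without any learned parameters.

    Hypothetical illustration: each modality is weighted by its agreement
    with the mean prediction, so an outlier modality contributes less.
    Returns (fused_probabilities, modality_weights).
    """
    probs = {name: softmax(l) for name, l in unimodal_logits.items()}
    n_classes = len(next(iter(probs.values())))

    # Average prediction across modalities acts as a crude "consensus".
    mean = [
        sum(p[c] for p in probs.values()) / len(probs)
        for c in range(n_classes)
    ]

    # Agreement = 1 - total variation distance to the consensus.
    agreement = {
        name: 1.0 - 0.5 * sum(abs(p[c] - mean[c]) for c in range(n_classes))
        for name, p in probs.items()
    }
    total = sum(agreement.values())
    weights = {name: a / total for name, a in agreement.items()}

    # Weighted average of the unimodal probabilities.
    fused = [
        sum(weights[name] * probs[name][c] for name in probs)
        for c in range(n_classes)
    ]
    return fused, weights


# Made-up logits over (positive, neutral, negative): the text modality
# says "positive" while audio and vision both lean "negative".
example = {
    "text": [3.0, 0.0, -1.0],
    "audio": [-0.5, 0.0, 1.5],
    "vision": [-1.0, 0.2, 1.2],
}
fused, weights = consensus_fuse(example)
predicted_class = max(range(len(fused)), key=lambda c: fused[c])
```

In this made-up example the text modality disagrees with the other two, so it receives the smallest weight and the fused prediction follows the audio and vision cues. A simple agreement heuristic like this can be wrong when the majority of modalities is misleading, which is one reason a learned or calibrated notion of consensus, as the paper proposes, may be preferable.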

Taxonomy

Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Multimodal Machine Learning Applications