Multimodal Video Emotion Recognition with Reliable Reasoning Priors

Zhepeng Wang; Yingjian Zhu; Guanghao Dong; Hongzhu Yi; Feng Chen; Xinming Wang; Jun Xie

arXiv:2508.03722·cs.CV·August 7, 2025


TL;DR

This paper presents a multimodal emotion recognition framework that leverages trustworthy reasoning priors from MLLMs and introduces a balanced contrastive learning approach, achieving significant improvements on the MER2024 benchmark.

Contribution

The paper introduces a method for incorporating MLLM-derived reasoning priors into multimodal emotion recognition and proposes a balanced contrastive loss to address class imbalance.

Findings

01 Significant performance gains on the MER2024 benchmark

02 Effective integration of reasoning priors enhances cross-modal fusion

03 Balanced dual-contrastive learning improves handling of imbalanced class distributions

Abstract

This study investigates the integration of trustworthy prior reasoning knowledge from multimodal large language models (MLLMs) into multimodal emotion recognition. We employ Gemini to generate fine-grained, modality-separable reasoning traces, which are injected as priors during the fusion stage to enrich cross-modal interactions. To mitigate the pronounced class imbalance in multimodal emotion recognition, we introduce Balanced Dual-Contrastive Learning, a loss formulation that jointly balances inter-class and intra-class distributions. Applied to the MER2024 benchmark, our prior-enhanced framework yields substantial performance gains, demonstrating that the reliability of MLLM-derived reasoning can be synergistically combined with the domain adaptability of lightweight fusion networks for robust, scalable emotion recognition.
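
The abstract does not give the form of the Balanced Dual-Contrastive Learning loss, so the following is only a generic illustration of how a contrastive loss can be made less sensitive to class imbalance. It is a minimal sketch of a supervised contrastive loss whose denominator averages the similarity terms within each class before summing across classes, so that a frequent class does not dominate the normalization. The function name `balanced_supcon_loss`, the `temperature` default, and the use of cosine similarity are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import defaultdict


def _cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def balanced_supcon_loss(embeddings, labels, temperature=0.1):
    """Hypothetical class-balanced supervised contrastive loss (illustrative).

    For each anchor, exp(similarity / temperature) to every other sample is
    grouped by that sample's class. The denominator is the sum over classes
    of the per-class MEAN, so each class contributes equally regardless of
    how many samples it has. Anchors without a positive are skipped.
    """
    n = len(embeddings)
    per_anchor = []
    for i in range(n):
        by_class = defaultdict(list)
        for k in range(n):
            if k != i:
                sim = _cosine(embeddings[i], embeddings[k])
                by_class[labels[k]].append(math.exp(sim / temperature))
        positives = by_class.get(labels[i], [])
        if not positives:
            continue  # no other sample shares this anchor's label
        denom = sum(sum(vals) / len(vals) for vals in by_class.values())
        loss_i = -sum(math.log(p / denom) for p in positives) / len(positives)
        per_anchor.append(loss_i)
    return sum(per_anchor) / len(per_anchor) if per_anchor else 0.0


# Toy batch: three samples near the x-axis, two near the y-axis.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.95, 0.05]]
aligned_labels = [0, 0, 1, 1, 0]   # labels agree with the geometry
shuffled_labels = [0, 1, 0, 1, 1]  # labels contradict the geometry

loss_aligned = balanced_supcon_loss(toy_embeddings, aligned_labels)
loss_shuffled = balanced_supcon_loss(toy_embeddings, shuffled_labels)
```

On this toy batch the loss is close to zero when labels match the cluster structure and much larger when they do not, which is the behavior expected of any supervised contrastive objective. The per-class averaging in the denominator is one common way to rebalance such a loss; the paper's own formulation, which also balances intra-class distributions, may differ.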
