Benchmarking and Bridging Emotion Conflicts for Multimodal Emotion Reasoning

Zhiyuan Han; Beier Zhu; Yanlong Xu; Peipei Song; Xun Yang

arXiv:2508.01181·cs.AI·October 14, 2025

TL;DR

This paper introduces CA-MER, a benchmark for emotion conflict scenarios in multimodal emotion reasoning, and proposes MoSEAR, a framework that balances modality contributions to improve emotion recognition accuracy.

Contribution

The paper presents CA-MER for evaluating emotion conflicts and proposes MoSEAR, a novel, parameter-efficient framework that mitigates modality bias in multimodal emotion reasoning.

Findings

01 MoSEAR reduces modality bias during emotion conflicts.

02 MoSEAR achieves state-of-the-art results on multiple benchmarks.

03 Balanced modality integration improves emotion recognition accuracy.

Abstract

Despite their strong performance in multimodal emotion reasoning, existing Multimodal Large Language Models (MLLMs) often overlook scenarios involving emotion conflicts, where emotional cues from different modalities are inconsistent. To fill this gap, we first introduce CA-MER, a new benchmark designed to examine MLLMs under realistic emotion conflicts. It consists of three subsets: video-aligned, audio-aligned, and consistent, where only one or all modalities reflect the true emotion. However, evaluations on CA-MER reveal that current state-of-the-art emotion MLLMs systematically over-rely on the audio signal during emotion conflicts, neglecting critical cues from the visual modality. To mitigate this bias, we propose MoSEAR, a parameter-efficient framework that promotes balanced modality integration. MoSEAR consists of two modules: (1) MoSE, modality-specific experts with a…
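
The three-subset design described in the abstract lends itself to a per-subset breakdown of accuracy: a model that leans on audio would be expected to score noticeably lower on the video-aligned subset than on the audio-aligned one. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code. The record layout, the function name `per_subset_accuracy`, and the toy predictions are all assumptions made for illustration; only the three subset names come from the abstract.

```python
from collections import defaultdict

# Subset names as listed in the abstract; everything else here is hypothetical.
SUBSETS = ("video-aligned", "audio-aligned", "consistent")


def per_subset_accuracy(records):
    """Return a dict mapping each subset to its accuracy.

    Each record is a (subset, true_emotion, predicted_emotion) tuple.
    This layout is an assumption, not the benchmark's actual file format.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subset, truth, pred in records:
        if subset not in SUBSETS:
            raise ValueError(f"unknown subset: {subset!r}")
        total[subset] += 1
        correct[subset] += int(truth == pred)
    # Skip subsets with no samples to avoid dividing by zero.
    return {s: correct[s] / total[s] for s in SUBSETS if total[s]}


# Made-up predictions, purely to exercise the function.
records = [
    ("video-aligned", "angry", "neutral"),
    ("video-aligned", "happy", "happy"),
    ("audio-aligned", "sad", "sad"),
    ("audio-aligned", "angry", "angry"),
    ("consistent", "happy", "happy"),
]
scores = per_subset_accuracy(records)
```

Comparing `scores["video-aligned"]` with `scores["audio-aligned"]` gives a rough indication of which modality a model favors when cues disagree; the metrics actually reported for CA-MER may differ from plain accuracy.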

Code & Models

Model (Hugging Face): AaronHan/MoSEAR
