Emotion-Coherent Reasoning for Multimodal LLMs via Emotional Rationale Verifier
Hyeongseop Rha, Jeong Hun Yeo, Yeonju Kim, Yong Man Ro

TL;DR
This paper introduces the Emotional Rationale Verifier, a method that improves the consistency and accuracy of emotion explanations in multimodal large language models, enhancing their emotional intelligence and trustworthiness.
Contribution
The paper proposes a novel verifier and an explanation reward that keep emotion explanations aligned with the model's predictions, without altering the model architecture or requiring extra annotations.
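As a rough illustration of the idea of rewarding explanations that agree with the predicted emotion, here is a minimal sketch. Everything in it is an assumption made for illustration: the keyword lexicon, the `toy_verifier` function, and the binary reward are hypothetical stand-ins, and this summary does not describe how the paper's actual verifier or reward is built.

```python
from typing import Callable, Optional

# Hypothetical keyword lexicon standing in for a learned verifier (assumption,
# not taken from the paper).
EMOTION_KEYWORDS = {
    "happy": ("smile", "laugh", "cheerful"),
    "sad": ("tears", "crying", "downcast"),
    "angry": ("shout", "glare", "clenched"),
}


def toy_verifier(explanation: str) -> Optional[str]:
    """Return the emotion whose keywords occur most often, or None if none occur."""
    text = explanation.lower()
    counts = {
        emotion: sum(text.count(keyword) for keyword in keywords)
        for emotion, keywords in EMOTION_KEYWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def explanation_reward(
    predicted: str,
    explanation: str,
    verifier: Callable[[str], Optional[str]] = toy_verifier,
) -> float:
    """Binary reward: 1.0 when the emotion the verifier reads from the
    explanation matches the predicted emotion, otherwise 0.0."""
    return 1.0 if verifier(explanation) == predicted else 0.0


if __name__ == "__main__":
    print(explanation_reward("happy", "She smiles and laughs warmly."))
    print(explanation_reward("happy", "He glares with clenched fists."))
```

A reward of this shape needs only the model's own prediction and explanation, which is consistent with the claim that no extra annotations are required; a real verifier would presumably be a trained model, not a keyword match.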
Findings
Significantly improves explanation-prediction consistency.
Enhances emotion explanation accuracy on benchmark datasets.
Empowers models to deliver emotionally coherent and trustworthy interactions.
Abstract
The recent advancement of Multimodal Large Language Models (MLLMs) is transforming human-computer interaction (HCI) from surface-level exchanges into more nuanced and emotionally intelligent communication. To realize this shift, emotion understanding becomes essential, allowing systems to capture subtle cues underlying user intent. Furthermore, providing faithful explanations for predicted emotions is crucial to ensure interpretability and build user trust. However, current MLLM-based methods often generate emotion explanations that diverge from the target labels and sometimes even contradict their own predicted emotions. This inconsistency poses a critical risk of misunderstanding and erodes reliability in interactive settings. To address this, we propose a novel approach: the Emotional Rationale Verifier (ERV) and an Explanation Reward. Our method guides the model to produce reasoning…
