Learning What to Attend First: Modality-Importance-Guided Reasoning for Reliable Multimodal Emotion Understanding

Hyeongseop Rha; Jeong Hun Yeo; Junil Won; Se Jin Park; Yong Man Ro

arXiv:2512.02699 · cs.AI · December 3, 2025

Open Access

TL;DR

This paper introduces MIGR, a framework that enhances multimodal emotion understanding by guiding reasoning to start from the most relevant modality, significantly improving explanation reliability and emotional consistency.

Contribution

The paper proposes Modality-Importance-Guided Reasoning (MIGR), a novel approach that reorganizes reasoning sequences based on modality importance to improve reliability in multimodal emotion understanding.

Findings

01. Reduces emotionally inconsistent explanations from 18.10% to 7.37%.

02. Improves reasoning reliability in multimodal emotion tasks.

03. Validates effectiveness on the DFEW benchmark.

Abstract

In this paper, we present Modality-Importance-Guided Reasoning (MIGR), a framework designed to improve the reliability of reasoning-based multimodal emotion understanding in multimodal large language models. Although existing methods have advanced emotion understanding, they often suffer from reasoning drift: models gradually rely on their own generated text instead of multimodal evidence, and their explanations are overly shaped by visually initiated reasoning paths. To address these issues, we introduce Modality Importance (MI), a simple yet effective mechanism for identifying the emotion-dominant modality. Using MI, MIGR reorganizes reasoning sequences so that explanations begin from the modality most critical to the target emotion, preventing early reasoning from being misled by less informative cues. Our two-stage framework, comprising modality-aligned supervised fine-tuning and…
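The reordering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how Modality Importance is computed, so the scores below are hypothetical inputs, and the names `ModalityStep` and `reorder_by_importance` are invented for illustration. The sketch only shows the step the abstract names, placing the reasoning for the most important modality first.

```python
from dataclasses import dataclass


@dataclass
class ModalityStep:
    modality: str   # e.g. "visual" or "audio"
    reasoning: str  # reasoning text grounded in that modality


def reorder_by_importance(steps, importance):
    """Return steps sorted so the highest-importance modality comes first.

    `importance` maps modality name -> score. The scores are assumed to be
    given; how MI is actually computed is not described in the abstract.
    Ties keep their original order because sorted() is stable.
    """
    return sorted(steps, key=lambda s: importance[s.modality], reverse=True)


# Hypothetical example: a visually initiated reasoning path whose
# emotion-dominant modality is actually audio.
steps = [
    ModalityStep("visual", "The speaker's face stays mostly neutral."),
    ModalityStep("audio", "The voice is trembling and rising in pitch."),
]
importance = {"visual": 0.2, "audio": 0.8}  # made-up scores

ordered = reorder_by_importance(steps, importance)
print([s.modality for s in ordered])
```

With these made-up scores, the audio step moves ahead of the visual one, so an explanation built from `ordered` would open with the audio evidence.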

Taxonomy

Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Topic Modeling