MIND Your Reasoning: A Meta-Cognitive Intuitive-Reflective Network for Dual-Reasoning in Multimodal Stance Detection

Bingbing Wang; Zhengda Jin; Bin Liang; Wenjie Li; Jing Li; Ruifeng Xu; Min Zhang

arXiv:2511.06057·cs.CL·January 6, 2026

Open Access

TL;DR

This paper introduces MIND, a dual-reasoning network inspired by human cognition, which improves multimodal stance detection by explicitly reasoning about inter-modal dynamics rather than just fusing modalities.

Contribution

MIND is the first model to incorporate a meta-cognitive dual-process reasoning framework for multimodal stance detection, enhancing interpretability and robustness.

Findings

01. MIND outperforms baseline models on the MMSD benchmark.

02. The dual-reasoning approach improves robustness and generalization.

03. Meta-cognitive reasoning enhances stance detection accuracy.

Abstract

Multimodal Stance Detection (MSD) is a crucial task for understanding public opinion on social media. Existing methods predominantly operate by learning to fuse modalities. They lack an explicit reasoning process to discern how inter-modal dynamics, such as irony or conflict, collectively shape the user's final stance, leading to frequent misjudgments. To address this, we advocate for a paradigm shift from *learning to fuse* to *learning to reason*. We introduce **MIND**, a **M**eta-cognitive **I**ntuitive-reflective **N**etwork for **D**ual-reasoning. Inspired by the dual-process theory of human cognition, MIND operationalizes a self-improving loop. It first generates a rapid, intuitive hypothesis by querying evolving Modality and Semantic Experience Pools. Subsequently, a meta-cognitive reflective stage uses Modality-CoT and Semantic-CoT to scrutinize this initial judgment, distill…
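
The two-stage loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract is truncated and does not specify how the pools are stored, queried, or updated. The names `Experience`, `ExperiencePool`, `intuitive_hypothesis`, `dual_reasoning_step`, and the `reviewer` callable are all hypothetical. Cosine-similarity retrieval with majority voting stands in for the intuitive stage, a plain callable stands in for the Modality-CoT and Semantic-CoT scrutiny, and writing the reflected result back into a pool is an assumed reading of "evolving" pools.

```python
from __future__ import annotations

import math
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Experience:
    """One stored case: a feature vector, its stance label, and a free-text note."""
    embedding: list[float]
    stance: str
    note: str = ""


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class ExperiencePool:
    """Hypothetical stand-in for a Modality or Semantic Experience Pool."""
    items: list[Experience] = field(default_factory=list)

    def query(self, embedding: list[float], k: int = 3) -> list[Experience]:
        # Return the k stored experiences most similar to the query.
        ranked = sorted(
            self.items,
            key=lambda e: cosine(embedding, e.embedding),
            reverse=True,
        )
        return ranked[:k]

    def add(self, experience: Experience) -> None:
        self.items.append(experience)


def intuitive_hypothesis(
    embedding: list[float],
    pools: list[ExperiencePool],
    k: int = 3,
    default: str = "neutral",
) -> str:
    """Fast stage (assumed): majority vote over retrieved neighbours."""
    votes: Counter[str] = Counter()
    for pool in pools:
        for exp in pool.query(embedding, k):
            votes[exp.stance] += 1
    return votes.most_common(1)[0][0] if votes else default


# A reviewer takes (embedding, hypothesis) and returns (final_stance, note).
Reviewer = Callable[[list[float], str], "tuple[str, str]"]


def dual_reasoning_step(
    embedding: list[float],
    modality_pool: ExperiencePool,
    semantic_pool: ExperiencePool,
    reviewer: Reviewer,
    k: int = 3,
) -> tuple[str, str]:
    """Run the fast stage, then the slow stage, then update a pool."""
    hypothesis = intuitive_hypothesis(embedding, [modality_pool, semantic_pool], k)
    # Slow stage: the reviewer is a placeholder for chain-of-thought scrutiny.
    final, note = reviewer(embedding, hypothesis)
    # Assumed write-back so later queries can see the reflected outcome.
    semantic_pool.add(Experience(embedding, final, note))
    return hypothesis, final
```

In this sketch the reviewer could be any function, for example one that calls a language model with the initial hypothesis and returns a revised stance plus a rationale. Keeping it as a plain callable makes the fast and slow stages independently replaceable.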

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Speech and dialogue systems