MIND: Multi-rationale INtegrated Discriminative Reasoning Framework for Multi-modal Large Models
Chuang Yu, Jinmiao Zhao, Mingxuan Zhao, Yunpeng Liu, Xiujun Shu, Yuanhao Feng, Bo Wang, Xiangyu Yue

TL;DR
The paper introduces MIND, a reasoning framework for multimodal large language models (MLLMs) that strengthens multi-rationale semantic understanding, logical robustness, and active correction, and reports state-of-the-art results on multiple reasoning datasets.
Contribution
It presents a comprehensive framework combining rationale augmentation, progressive correction, and contrastive alignment to improve reasoning in multimodal large models.
Findings
Achieves SOTA performance on multiple reasoning datasets.
Enhances logical robustness and multi-rationale semantic modeling.
Provides a new paradigm for active discriminative reasoning in MLLMs.
Abstract
Recently, multimodal large language models (MLLMs) have been widely applied to reasoning tasks. However, they suffer from limited multi-rationale semantic modeling and insufficient logical robustness, and they are susceptible to misleading interpretations in complex scenarios. Therefore, we propose a Multi-rationale INtegrated Discriminative (MIND) reasoning framework, which is designed to endow MLLMs with the human-like cognitive abilities of "Understand -> Rethink -> Correct" and which achieves a paradigm evolution from passive imitation-based reasoning to active discriminative reasoning. Specifically, we introduce a Rationale Augmentation and Discrimination (RAD) paradigm, which automatically and efficiently expands existing datasets by generating diverse rationales, providing a unified and extensible data foundation. Meanwhile, we design a Progressive Two-stage Correction Learning (P2CL) strategy.…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Graph Neural Networks
