Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning

Yian Li; Wentao Tian; Yang Jiao; Jingjing Chen; Tianwen Qian; Bin Zhu; Na Zhao; Yu-Gang Jiang

arXiv:2404.12966 · cs.CV · April 18, 2025 · 1 citation

Open Access

TL;DR

This paper investigates the reasoning capabilities of Multimodal Large Language Models (MLLMs), introduces a new benchmark for assumptive reasoning, and proposes an active deduction method to enhance their reasoning skills without affecting general performance.

Contribution

The paper presents MARS-Bench for evaluating assumptive reasoning in MLLMs and introduces Active Deduction, a reinforcement learning approach to improve their reasoning abilities.

Findings

1. Most MLLMs are easily fooled by naive presuppositions.
2. Active Deduction significantly improves MLLMs' assumptive reasoning.
3. The method maintains overall question-answering performance.

Abstract

Recently, Multimodal Large Language Models (MLLMs) have achieved significant success across multiple disciplines due to their exceptional instruction-following capabilities and extensive world knowledge. However, whether these MLLMs possess human-like compositional reasoning abilities remains an open problem. To unveil their reasoning behaviors, we first curate a Multimodal Assumptive Reasoning Benchmark (MARS-Bench) in this paper. Interestingly, we find that most prevalent MLLMs can be easily fooled by the introduction of a presupposition into the question, whereas such presuppositions appear naive to human reasoning. Besides, we also propose a simple yet effective method, Active Deduction (AD), a novel reinforcement learning paradigm to encourage the model to actively perform composite deduction before reaching a final decision. Equipped with the…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques