Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild
Wanpeng Hu, Haodi Liu, Lin Chen, Feng Zhou, Changming Xiao, Qi Yang, Changshui Zhang

TL;DR
This paper introduces Socratic Questioning, a novel multi-round training framework for lightweight multimodal models that improves visual reasoning, reduces hallucinations, and enhances fine-grained image understanding through self-guided questioning.
Contribution
The paper proposes a new self-questioning framework called Socratic Questioning for multimodal models, combining reasoning and instruction tuning to improve visual reasoning and hallucination mitigation.
Findings
31.2% reduction in hallucination score
Significant improvement in zero-shot visual reasoning
Effective in complex visual question-answering tasks
Abstract
Complex visual reasoning remains a key challenge today. Typically, the challenge is tackled using methodologies such as Chain of Thought (CoT) and visual instruction tuning. However, how to organically combine these two methodologies for greater success remains unexplored. Also, issues like hallucinations and high training cost still need to be addressed. In this work, we devise an innovative multi-round training and reasoning framework suitable for lightweight Multimodal Large Language Models (MLLMs). Our self-questioning approach heuristically guides MLLMs to focus on visual clues relevant to the target problem, reducing hallucinations and enhancing the model's ability to describe fine-grained image details. This ultimately enables the model to perform well in complex visual reasoning and question-answering tasks. We have named this framework Socratic Questioning (SQ). To facilitate…
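The abstract describes the self-questioning idea only at a high level, so the following is a minimal sketch of what a multi-round inference loop of this kind could look like, not the authors' implementation. The function name `socratic_answer`, the three-round split, the prompt wording, and the `model(image, prompt)` callable interface are all assumptions made for illustration.

```python
from typing import Any, Callable, List, Tuple

# Hypothetical interface: any callable that maps (image, prompt) to a text reply.
Model = Callable[[Any, str], str]


def socratic_answer(model: Model, image: Any, question: str,
                    max_sub_questions: int = 3) -> str:
    """Answer `question` about `image` via self-generated sub-questions.

    Illustrative only: the rounds and prompts are assumptions, not the
    procedure from the paper.
    """
    # Round 1 (self-ask): have the model propose sub-questions about visual clues.
    raw = model(
        image,
        f"List up to {max_sub_questions} sub-questions about the image, "
        f"one per line, that help answer: {question}",
    )
    sub_questions: List[str] = [
        line.strip() for line in raw.splitlines() if line.strip()
    ][:max_sub_questions]

    # Round 2 (self-answer): answer each sub-question against the image.
    clues: List[Tuple[str, str]] = [(q, model(image, q)) for q in sub_questions]

    # Round 3 (answer): condition the final answer on the collected clues.
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in clues)
    return model(
        image,
        f"Visual clues:\n{context}\n\n"
        f"Using only these clues, answer: {question}",
    )
```

Grounding the final prompt in answers the model has already given about specific visual details is one plausible way such a loop could steer attention toward relevant clues; whether this matches the paper's training and inference procedure cannot be confirmed from the truncated abstract.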
Taxonomy
Topics: Education and Critical Thinking Development
Methods: Focus
