Pseudocode-Guided Structured Reasoning for Automating Reliable Inference in Vision-Language Models

Weicong Ni; Tianbao Jiang; Linlin Wang

arXiv:2605.19663 · cs.AI · May 20, 2026

TL;DR

The paper introduces PStar, a pseudocode-guided structured reasoning framework that enhances the robustness and reliability of vision-language models by adaptively selecting reasoning paths based on question difficulty.

Contribution

The paper proposes an adaptive reasoning strategy that combines structured pseudocode with a difficulty assessment step to reduce hallucinations in vision-language models.

Findings

01  Achieves 87.1% on POPE and 68.0% on MMStar benchmarks.

02  Significantly reduces hallucination rates compared to previous methods.

03  Outperforms GPT-4V in reasoning tasks.

Abstract

Vision-Language Models (VLMs) are becoming the cornerstone of high-level reasoning for robotic automation, enabling robots to parse natural language commands and perceive their environments. However, their susceptibility to hallucinations introduces critical failures in decision-making, posing significant safety and reliability risks in physical deployments. This challenge is exacerbated by the open-ended nature of real-world tasks, where questions vary vastly in difficulty and modality, demanding robust and adaptable reasoning strategies. To tackle this, we propose the Pseudocode-guided Structured Reasoning framework (PStar), which adaptively selects structured pseudocode reasoning paths to help VLMs perform flexible and step-by-step reasoning. We first design a set of abstract reasoning functions and formulate a structured pseudocode library to represent modular reasoning strategies.…
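
The idea of picking a pseudocode reasoning path according to question difficulty can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names (`locate_object`, `describe_scene`, `verify_evidence`, `answer`), the two difficulty levels, and the word-count heuristic in `assess_difficulty` are hypothetical and are not taken from the paper, whose actual reasoning functions and difficulty assessment are not given in the abstract above. No VLM is called; each step only records its name so the selected path is visible.

```python
from typing import Callable, Dict, List

# Hypothetical types: a reasoning function maps a state dict to a new state dict.
State = Dict[str, object]
ReasoningFn = Callable[[State], State]


def _step(name: str) -> ReasoningFn:
    """Build a placeholder reasoning function that only logs its own name."""
    def fn(state: State) -> State:
        return {**state, "trace": [*state["trace"], name]}
    fn.__name__ = name
    return fn


# Hypothetical pseudocode library: one ordered list of steps per difficulty level.
PSEUDOCODE_LIBRARY: Dict[str, List[ReasoningFn]] = {
    "easy": [_step("locate_object"), _step("answer")],
    "hard": [
        _step("describe_scene"),
        _step("locate_object"),
        _step("verify_evidence"),
        _step("answer"),
    ],
}


def assess_difficulty(question: str) -> str:
    """Placeholder heuristic (an assumption): long or compound questions are 'hard'."""
    words = question.split()
    return "hard" if len(words) > 8 or " and " in question.lower() else "easy"


def run(question: str) -> List[str]:
    """Select a path by assessed difficulty and execute its steps in order."""
    path = PSEUDOCODE_LIBRARY[assess_difficulty(question)]
    state: State = {"question": question, "trace": []}
    for fn in path:
        state = fn(state)
    return state["trace"]


if __name__ == "__main__":
    print(run("Is there a dog?"))
    print(run("Is the cup to the left of the laptop and behind the red book?"))
```

In this sketch a short yes/no question takes the two-step path, while a compound spatial question is routed through the longer path that includes an explicit verification step before answering.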
