VLMs have Tunnel Vision: Evaluating Nonlocal Visual Reasoning in Leading VLMs
Shmuel Berman, Jia Deng

TL;DR
This paper evaluates leading vision-language models on nonlocal visual reasoning, which requires chaining evidence from multiple, possibly distant regions of an image. The models struggle with tasks that humans perform easily, including models that do well on prior primitive-vision benchmarks.
Contribution
The paper introduces a structured evaluation suite for nonlocal visual reasoning, highlighting the gap between current VLMs' visual acuity and core reasoning capabilities.
Findings
Flagship VLMs (e.g., GPT-5, Gemini 2.5 Pro, Claude Sonnet 4) barely exceed random accuracy on two variants of the nonlocal reasoning tasks.
Current models lack core visual reasoning abilities despite strong raw visual acuity.
The evaluation suite isolates three forms of nonlocal vision: comparative perception, saccadic search, and smooth visual search.
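To make "barely above random" concrete, the sketch below shows one way to compare a model's accuracy against the chance rate of a multiple-choice task using a one-sided exact binomial test. It is an illustration only, not the paper's evaluation code: the function names, the four-way answer format, and the counts in the usage note are all hypothetical.

```python
from math import comb


def chance_accuracy(num_choices: int) -> float:
    """Accuracy expected from uniform random guessing among num_choices options."""
    return 1.0 / num_choices


def binomial_p_value(correct: int, total: int, chance: float) -> float:
    """One-sided exact p-value: P(X >= correct) for X ~ Binomial(total, chance)."""
    return sum(
        comb(total, k) * chance**k * (1.0 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


def summarize(correct: int, total: int, num_choices: int) -> dict:
    """Report accuracy, the chance rate, the margin over chance, and the p-value."""
    chance = chance_accuracy(num_choices)
    accuracy = correct / total
    return {
        "accuracy": accuracy,
        "chance": chance,
        "margin": accuracy - chance,
        "p_value": binomial_p_value(correct, total, chance),
    }


if __name__ == "__main__":
    # Hypothetical counts, not results from the paper.
    print(summarize(correct=27, total=100, num_choices=4))
```

With the hypothetical counts above (27 correct out of 100 four-way questions), the margin over chance is small and the p-value is large, so such a result would be statistically indistinguishable from guessing.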
Abstract
Vision-Language Models (VLMs) excel at complex visual tasks such as VQA and chart understanding, yet recent work suggests they struggle with simple perceptual tests. We present an evaluation of vision-language models' capacity for nonlocal visual reasoning: reasoning that requires chaining evidence collected from multiple, possibly distant regions of an image. We isolate three distinct forms of nonlocal vision: comparative perception, which demands holding two images in working memory and comparing them; saccadic search, which requires making discrete, evidence-driven jumps to locate successive targets; and smooth visual search, which involves following a continuous contour. Flagship models (e.g., GPT-5, Gemini 2.5 Pro, Claude Sonnet 4), even those that perform well on prior primitive-vision benchmarks, fail these tests and barely exceed random accuracy on two variants of our tasks that…
Taxonomy
Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Tactile and Sensory Interactions
