Constructive Apraxia: An Unexpected Limit of Instructible Vision-Language Models and Analog for Human Cognitive Disorders
David Noever, Samantha E. Miller Noever

TL;DR
This paper uncovers a surprising similarity between vision-language models' inability to perform spatial tasks and human constructive apraxia, revealing a fundamental limitation in current AI models' spatial reasoning skills.
Contribution
The paper demonstrates that state-of-the-art VLMs fail at a basic spatial reasoning task, rendering the Ponzo illusion, in a way that parallels human cognitive deficits and suggests new directions for AI development.
Findings
24 out of 25 models failed to correctly render the Ponzo illusion
Models misinterpreted spatial instructions, producing tilted or misaligned lines
Behavior mirrors deficits seen in patients with constructive apraxia
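The pass/fail criterion implied by these findings (two lines that stay horizontal over a converging background) can be illustrated with a small geometry check. This is only a sketch under stated assumptions: the function names, the canvas size, and the 2-degree tolerance are hypothetical and are not taken from the paper, and the code works on line-segment coordinates rather than on generated images.

```python
import math

def tilt_degrees(x0, y0, x1, y1):
    """Smallest angle between a segment and the horizontal, in degrees."""
    angle = abs(math.degrees(math.atan2(y1 - y0, x1 - x0))) % 180.0
    # Fold into [0, 90] so the drawing direction does not matter.
    return min(angle, 180.0 - angle)

def is_horizontal(segment, tolerance_deg=2.0):
    """True if the segment is within tolerance_deg of horizontal (assumed threshold)."""
    return tilt_degrees(*segment) <= tolerance_deg

def ponzo_reference(width=400, height=300, bar_length=80):
    """Hypothetical reference layout: two converging rails, two equal horizontal bars."""
    cx = width / 2
    rails = [
        (0.0, float(height), cx - 20, 0.0),           # left rail, converging upward
        (float(width), float(height), cx + 20, 0.0),  # right rail, converging upward
    ]
    bars = [
        (cx - bar_length / 2, y, cx + bar_length / 2, y)
        for y in (height * 0.25, height * 0.75)
    ]
    return rails, bars

rails, bars = ponzo_reference()
bars_ok = all(is_horizontal(b) for b in bars)     # reference bars stay horizontal
rails_ok = not any(is_horizontal(r) for r in rails)  # rails follow the perspective
```

A bar drawn along the background's perspective, such as the segment (0, 0) to (100, 10), has a tilt of roughly 5.7 degrees and would fail this check, which corresponds to the tilted or misaligned output described in the findings.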
Abstract
This study reveals an unexpected parallel between instructible vision-language models (VLMs) and human cognitive disorders, specifically constructive apraxia. We tested 25 state-of-the-art VLMs, including GPT-4 Vision, DALL-E 3, and Midjourney v5, on their ability to generate images of the Ponzo illusion, a task that requires basic spatial reasoning and is often used in clinical assessments of constructive apraxia. Remarkably, 24 out of 25 models failed to correctly render two horizontal lines against a perspective background, mirroring the deficits seen in patients with parietal lobe damage. The models consistently misinterpreted spatial instructions, producing tilted or misaligned lines that followed the perspective of the background rather than remaining horizontal. This behavior is strikingly similar to how apraxia patients struggle to copy or construct simple figures despite intact…
Taxonomy
Topics: Neurobiology of Language and Bilingualism
