Probing Perceptual Constancy in Large Vision-Language Models

Haoran Sun; Bingyang Wang; Suyang Yu; Yijiang Li; Qingying Gao; Haiyun Lyu; Lianyu Huang; Zelong Hong; Jiahui Ge; Qianli Ma; Hang He; Yifan Zhou; Lingzi Guo; Lantao Mei; Maijunxian Wang; Dezhi Luo; Hokin Deng

arXiv:2502.10273 · cs.CV · February 9, 2026

TL;DR

This study evaluates 155 vision-language models on their ability to maintain perceptual constancy across color, size, and shape, revealing significant variability and dissociations in their perceptual stability.

Contribution

The paper provides a comprehensive assessment of perceptual constancy in large vision-language models using diverse experiments, and it introduces new in-the-wild tasks.

Findings

01. Shape constancy performance is dissociated from color and size constancy performance.
02. Performance varies significantly across models and domains.
03. In-the-wild tasks reveal real-world perceptual challenges.

Abstract

Perceptual constancy is the ability to maintain stable perceptions of objects despite changes in sensory input, such as variations in distance, angle, or lighting. This ability is crucial for visual understanding in a dynamic world. Here, we explore this ability in current Vision Language Models (VLMs). We evaluated 155 VLMs on 236 experiments across three domains: color, size, and shape constancy. The experiments included single-image and video adaptations of classic cognitive tasks, along with novel tasks in in-the-wild conditions. We found significant variability in VLM performance across these domains, with model performance in shape constancy clearly dissociated from that in color and size constancy.
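
One way to make the reported dissociation concrete is to compute each model's accuracy per domain and then correlate those accuracies across domains: a weak correlation between shape and the other two domains, alongside a strong color-size correlation, would be consistent with the claim. The sketch below is illustrative only and is not the authors' analysis; the record layout `(model, domain, correct)` and all function names are assumptions introduced here.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def accuracy_by_domain(records):
    """Map each domain to {model: fraction of trials answered correctly}.

    `records` is a hypothetical iterable of (model, domain, correct) tuples.
    """
    trials = defaultdict(lambda: defaultdict(list))
    for model, domain, correct in records:
        trials[domain][model].append(1.0 if correct else 0.0)
    return {
        domain: {model: mean(scores) for model, scores in per_model.items()}
        for domain, per_model in trials.items()
    }


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def cross_domain_correlation(acc, domain_a, domain_b):
    """Correlate per-model accuracies over the models present in both domains."""
    models = sorted(set(acc[domain_a]) & set(acc[domain_b]))
    return pearson(
        [acc[domain_a][m] for m in models],
        [acc[domain_b][m] for m in models],
    )
```

Calling `cross_domain_correlation(acc, "color", "size")` and `cross_domain_correlation(acc, "color", "shape")` on the output of `accuracy_by_domain` gives two numbers to compare. A correlation is only one possible operationalization of "dissociated"; the paper may use a different measure.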

Code & Models

Datasets

grow-ai-like-a-child/perceptual-constancy
dataset · 23 dl

Taxonomy

Topics: Multimodal Machine Learning Applications · Categorization, perception, and language · Advanced Image and Video Retrieval Techniques