When Visual Evidence is Ambiguous: Pareidolia as a Diagnostic Probe for Vision Models

Qianpu Chen; Derya Soydaner; Rob Saunders

arXiv:2603.03989 · cs.CV · March 5, 2026

TL;DR

This paper investigates how different vision models interpret ambiguous face-like patterns, revealing distinct mechanisms of interpretation and biases, and proposes a diagnostic framework to analyze their behavior under ambiguity.

Contribution

It introduces a unified diagnostic framework for analyzing vision models' responses to ambiguous face pareidolia, revealing model-specific mechanisms and biases.

Findings

01. VLMs show semantic overactivation, overcalling face-like patterns.

02. ViT employs uncertainty-based abstention, remaining unbiased.

03. Detection models suppress pareidolia through conservative priors.

Abstract

When visual evidence is ambiguous, vision models must decide whether to interpret face-like patterns as meaningful. Face pareidolia, the perception of faces in non-face objects, provides a controlled probe of this behavior. We introduce a representation-level diagnostic framework that analyzes detection, localization, uncertainty, and bias across class, difficulty, and emotion in face pareidolia images. Under a unified protocol, we evaluate six models spanning four representational regimes: vision-language models (VLMs; CLIP-B/32, CLIP-L/14, LLaVA-1.5-7B), pure vision classification (ViT), general object detection (YOLOv8), and face detection (RetinaFace). Our analysis reveals three mechanisms of interpretation under ambiguity. VLMs exhibit semantic overactivation, systematically pulling ambiguous non-human regions toward the Human concept, with LLaVA-1.5-7B producing the strongest and…
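As a rough illustration of two of the quantities the abstract names (uncertainty and bias), the sketch below shows one way an entropy-based abstention rule and a "Human overcall" rate could be computed from per-class scores. It is not the authors' protocol: the label set (`Human`, `Animal`, `Object`), the `max_entropy` threshold, and all function names are assumptions made for this example.

```python
import math

# Hypothetical label set; the paper's actual classes may differ.
LABELS = ["Human", "Animal", "Object"]


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy scaled to [0, 1]; 1.0 means a uniform distribution."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))


def predict_or_abstain(logits, labels=LABELS, max_entropy=0.9):
    """Return the top label, or None when the model is too uncertain.

    max_entropy is an assumed threshold, not a value from the paper.
    """
    probs = softmax(logits)
    if normalized_entropy(probs) > max_entropy:
        return None  # abstain under ambiguity
    return labels[probs.index(max(probs))]


def human_overcall_rate(predictions, true_labels, human="Human"):
    """Fraction of non-human items that were predicted as the human class."""
    non_human = [p for p, t in zip(predictions, true_labels) if t != human]
    if not non_human:
        return 0.0
    return sum(p == human for p in non_human) / len(non_human)
```

Under these assumptions, a model that abstains often on ambiguous inputs would show many `None` predictions and a low overcall rate, whereas a model with semantic overactivation would show a high overcall rate on non-human items.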

Taxonomy

Topics: Face Recognition and Perception · Face recognition and analysis · Visual Attention and Saliency Detection