Evaluating Graphical Perception Capabilities of Vision Transformers

Poonam Poonam; Pere-Pau Vázquez; Timo Ropinski

arXiv:2602.18178·cs.CV·February 23, 2026

TL;DR

This paper evaluates the graphical perception capabilities of Vision Transformers (ViTs) by benchmarking them against CNNs and humans in visualization tasks, revealing perceptual gaps and implications for visualization system design.

Contribution

The paper presents the first systematic assessment of ViT performance on graphical perception tasks modeled on foundational human perception studies.

Findings

01. ViTs perform strongly in general vision tasks.

02. ViTs show limited alignment with human perception in visualization tasks.

03. ViTs exhibit perceptual gaps that are relevant to visualization applications.

Abstract

Vision Transformers (ViTs) have emerged as a powerful alternative to convolutional neural networks (CNNs) in a variety of image-based tasks. While CNNs have previously been evaluated for their ability to perform graphical perception tasks, which are essential for interpreting visualizations, the perceptual capabilities of ViTs remain largely unexplored. In this work, we investigate the performance of ViTs in elementary visual judgment tasks inspired by the foundational studies of Cleveland and McGill, which quantified the accuracy of human perception across different visual encodings. Following their study design, we benchmark ViTs against CNNs and human participants in a series of controlled graphical perception tasks. Our results reveal that, although ViTs demonstrate strong performance in general vision tasks, their alignment with human-like graphical perception in the visualization…
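For readers unfamiliar with how accuracy is typically scored in Cleveland and McGill style experiments, the following is a minimal sketch of the log absolute error measure from their original study, log2(|judged percent - true percent| + 1/8). The truncated abstract above does not state which metric this paper uses, so applying this measure here is an assumption, and the function names are hypothetical.

```python
import math


def log_abs_error(judged_percent: float, true_percent: float) -> float:
    """Cleveland-McGill style error for one judgment.

    Both arguments are percentages (0-100). The 1/8 offset keeps the
    logarithm finite when the judgment is exactly right.
    """
    return math.log2(abs(judged_percent - true_percent) + 0.125)


def mean_log_abs_error(judged: list[float], truth: list[float]) -> float:
    """Average the per-judgment error over a set of trials.

    Assumption: whether the paper aggregates by mean or midmean is not
    stated in the excerpt; a plain mean is used here for illustration.
    """
    if len(judged) != len(truth) or not judged:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(log_abs_error(j, t) for j, t in zip(judged, truth))
    return total / len(judged)
```

Lower values indicate more accurate judgments. Because the same score can be computed for model predictions and for human responses, it gives a common scale for comparing ViTs, CNNs, and human participants on a given visual encoding.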

Taxonomy

Topics: Face Recognition and Perception · Data Visualization and Analytics · Visual perception and processing mechanisms