Understanding Graphical Perception in Data Visualization through Zero-shot Prompting of Vision-Language Models

Grace Guo; Jenna Jiayi Kang; Raj Sanjay Shah; Hanspeter Pfister; Sashank Varma

arXiv:2411.00257·cs.AI·November 4, 2024

Open Access

TL;DR

This paper evaluates how vision-language models perform on graphical perception tasks in a zero-shot setting, comparing their performance to humans and analyzing their sensitivity to stylistic variations.

Contribution

It establishes the potential of VLMs to model human-like chart comprehension and highlights their sensitivity to stylistic changes in visualizations.

Findings

01

VLMs perform similarly to humans under specific task and style combinations

02

VLM accuracy varies with stylistic changes like color and contiguity

03

Zero-shot prompting can reveal human-like perception patterns in VLMs

Abstract

Vision Language Models (VLMs) have been successful at many chart comprehension tasks that require attending to both the images of charts and their accompanying textual descriptions. However, it is not well established how VLM performance profiles map to human-like behaviors. If VLMs can be shown to have human-like chart comprehension abilities, they can then be applied to a broader range of tasks, such as designing and evaluating visualizations for human readers. This paper lays the foundations for such applications by evaluating the accuracy of zero-shot prompting of VLMs on graphical perception tasks with established human performance profiles. Our findings reveal that VLMs perform similarly to humans under specific task and style combinations, suggesting that they have the potential to be used for modeling human performance. Additionally, variations to the input stimuli show that VLM…

Taxonomy

Topics: Data Visualization and Analytics