Do MLLMs See What We See? Analyzing Visualization Literacy Barriers in AI Systems
Mengli (Dawn) Duan, Yuhe (Sissi) Jiang, Matthew Varona, and Carolina Nobre

TL;DR
This paper systematically analyzes why multimodal large language models (MLLMs) struggle to interpret visualizations, identifying specific barriers and offering insights for improving their visualization literacy.
Contribution
It presents the first systematic analysis and taxonomy of visualization literacy barriers in MLLMs, based on the reVLAT benchmark, and highlights machine-specific barriers that go beyond those documented for humans.
Findings
Models perform well on simple charts but struggle with color-intensive, segment-based visualizations.
A major failure mode is inconsistent comparative reasoning.
The analysis identifies two machine-specific barriers that extend prior human visualization literacy frameworks.
Abstract
Multimodal Large Language Models (MLLMs) are increasingly used to interpret visualizations, yet little is known about why they fail. We present the first systematic analysis of barriers to visualization literacy in MLLMs. Using the regenerated Visualization Literacy Assessment Test (reVLAT) benchmark with synthetic data, we open-coded 309 erroneous responses from four state-of-the-art models with a barrier-centric strategy adapted from human visualization literacy research. Our analysis yields a taxonomy of MLLM failures, revealing two machine-specific barriers that extend prior human-participation frameworks. Results show that models perform well on simple charts but struggle with color-intensive, segment-based visualizations, often failing to form consistent comparative reasoning. Our findings inform future evaluation and design of reliable AI-driven visualization assistants.
Taxonomy
Topics: Data Visualization and Analytics · Multimodal Machine Learning Applications · Computational and Text Analysis Methods
