Do Large Language Models Understand Data Visualization Principles?

Martin Sinnona; Valentin Bonas; Viviana Siless; Emmanuel Iarussi

arXiv:2602.20084·cs.CV·February 24, 2026

TL;DR

This paper systematically evaluates large language and vision-language models on their ability to reason about and enforce data visualization principles, highlighting their potential and current limitations in visual design validation.

Contribution

It introduces a novel evaluation framework for LLMs and VLMs in visualization principle reasoning, including a dataset of Vega-Lite specifications with violations and real charts.
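To make the idea of a Vega-Lite specification with a violation concrete, here is a minimal sketch of how a single principle could be checked symbolically on a spec. The rule chosen (a bar chart whose quantitative axis is configured to omit zero) is a commonly cited guideline used for illustration only; it is an assumption and not necessarily one of the principles in the paper's dataset. The function names `mark_type` and `bar_axis_omits_zero` are hypothetical.

```python
from __future__ import annotations

from typing import Any


def mark_type(spec: dict[str, Any]) -> str | None:
    """Return the mark type; Vega-Lite allows "bar" or {"type": "bar", ...}."""
    mark = spec.get("mark")
    if isinstance(mark, dict):
        return mark.get("type")
    return mark


def bar_axis_omits_zero(spec: dict[str, Any]) -> bool:
    """Illustrative rule: flag bar charts whose quantitative x or y scale
    explicitly sets "zero" to false, which truncates the bar baseline."""
    if mark_type(spec) != "bar":
        return False
    encoding = spec.get("encoding", {})
    for channel in ("x", "y"):
        channel_def = encoding.get(channel, {})
        if channel_def.get("type") != "quantitative":
            continue
        scale = channel_def.get("scale") or {}
        if scale.get("zero") is False:
            return True
    return False


# Hypothetical example specs (not taken from the paper's dataset).
violating_spec = {
    "mark": "bar",
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "value", "type": "quantitative", "scale": {"zero": False}},
    },
}
compliant_spec = {
    "mark": {"type": "bar"},
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "value", "type": "quantitative"},
    },
}

print(bar_axis_omits_zero(violating_spec))
print(bar_axis_omits_zero(compliant_spec))
```

A hand-written check like this is precise but covers only one rule, and every further principle needs its own encoding. That authoring cost is what motivates asking whether LLMs and VLMs can act as principle checkers directly.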

Findings

01. Models can detect some violations but struggle with nuanced principles.

02. Models are more effective at correcting violations than detecting them.

03. Symbolic solvers outperform models on complex visual perception tasks.

Abstract

Data visualization principles, derived from decades of research in design and perception, ensure proper visual communication. While prior work has shown that large language models (LLMs) can generate charts or flag misleading figures, it remains unclear whether they and their vision-language counterparts (VLMs) can reason about and enforce visualization principles directly. Constraint-based systems encode these principles as logical rules for precise automated checks, but translating them into formal specifications demands expert knowledge. This motivates leveraging LLMs and VLMs as principle checkers that can reason about visual design directly, bypassing the need for symbolic rule specification. In this paper, we present the first systematic evaluation of both LLMs and VLMs on their ability to reason about visualization principles, using hard verification ground truth derived from…

Taxonomy

Topics: Data Visualization and Analytics · Multimodal Machine Learning Applications · Model-Driven Software Engineering Techniques