Charting the Future: Using Chart Question-Answering for Scalable Evaluation of LLM-Driven Data Visualizations
James Ford, Xingmeng Zhao, Dan Schumacher, and Anthony Rios

TL;DR
This paper introduces a VQA-based framework for scalable evaluation of LLM-generated data visualizations, highlighting current limitations and potential for rapid, automated assessment of chart quality.
Contribution
It presents a novel VQA-driven evaluation method for LLM-generated charts, enabling scalable and effective assessment of visual communication quality.
Findings
LLM-generated charts lag behind original charts in accuracy according to VQA metrics.
Few-shot prompting improves the quality of LLM-generated visualizations.
Significant progress is needed before LLMs can produce charts matching human quality.
Abstract
We propose a novel framework that leverages Visual Question Answering (VQA) models to automate the evaluation of LLM-generated data visualizations. Traditional evaluation methods often rely on human judgment, which is costly and unscalable, or focus solely on data accuracy, neglecting the effectiveness of visual communication. By employing VQA models, we assess data representation quality and the general communicative clarity of charts. Experiments were conducted using two leading VQA benchmark datasets, ChartQA and PlotQA, with visualizations generated by OpenAI's GPT-3.5 Turbo and Meta's Llama 3.1 70B-Instruct models. Our results indicate that LLM-generated charts do not match the accuracy of the original non-LLM-generated charts based on VQA performance measures. Moreover, while our results demonstrate that few-shot prompting significantly boosts the accuracy of chart generation,…
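The comparison described in the abstract, VQA accuracy on original charts versus LLM-generated ones, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the record layout, the `answer` callable standing in for a VQA model, the stub answers, and the 5% numeric tolerance are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical record layout: (chart_id, question, gold_answer).
QAItem = Tuple[str, str, str]


def is_correct(pred: str, gold: str, tol: float = 0.05) -> bool:
    """Score one answer. Numeric answers match within a relative
    tolerance (an assumed 5%); anything else must match exactly,
    ignoring case and surrounding whitespace."""
    try:
        p, g = float(pred), float(gold)
    except ValueError:
        return pred.strip().lower() == gold.strip().lower()
    if g == 0.0:
        return p == 0.0
    return abs(p - g) / abs(g) <= tol


def vqa_accuracy(answer: Callable[[str, str], str],
                 items: Iterable[QAItem]) -> float:
    """Fraction of questions the VQA model answers correctly.
    `answer(chart_id, question)` is a stand-in for a real model call."""
    items = list(items)
    if not items:
        raise ValueError("no QA items to score")
    hits = sum(is_correct(answer(cid, q), gold) for cid, q, gold in items)
    return hits / len(items)


# Toy data: the same questions are asked of both chart versions.
items = [
    ("c1", "What is the highest value?", "40"),
    ("c1", "Which category is largest?", "Asia"),
    ("c2", "How many bars are there?", "5"),
]

# Stubbed model outputs (assumed, for illustration only).
on_original: Dict[Tuple[str, str], str] = {
    ("c1", "What is the highest value?"): "41",
    ("c1", "Which category is largest?"): "asia",
    ("c2", "How many bars are there?"): "5",
}
on_generated: Dict[Tuple[str, str], str] = {
    ("c1", "What is the highest value?"): "48",
    ("c1", "Which category is largest?"): "Asia",
    ("c2", "How many bars are there?"): "4",
}

acc_original = vqa_accuracy(lambda c, q: on_original[(c, q)], items)
acc_generated = vqa_accuracy(lambda c, q: on_generated[(c, q)], items)
print(f"original={acc_original:.2f} generated={acc_generated:.2f} "
      f"gap={acc_original - acc_generated:.2f}")
```

In a setup like this, the gap between the two accuracies serves as the signal of how much information the regenerated chart fails to communicate. A real pipeline would replace the stub dictionaries with calls to a VQA model over rendered chart images.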
Taxonomy
Topics: Semantic Web and Ontologies · Scientific Computing and Data Management
Methods: Attention Is All You Need · Cosine Annealing · Layer Normalization · Linear Warmup With Cosine Annealing · Adam · Linear Layer · Residual Connection · Weight Decay
