Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent Approach

James Ford; Anthony Rios

arXiv:2506.06175·cs.CL·June 9, 2025

Open Access

TL;DR

This paper introduces a multi-agent pipeline built on GPT-4o-mini that cuts execution errors in text-to-chart generation to 4.5% on Text2Chart31, and it shows that current benchmarks mask problems such as hallucinations and poor accessibility.

Contribution

The paper proposes a lightweight multi-agent approach that raises execution success rates and argues that evaluation should cover chart quality and accessibility, not only whether the generated code runs.

Findings

01. Execution error rate reduced to 4.5% on Text2Chart31

02. System outperforms fine-tuned baseline by nearly 5 percentage points

03. Manual review uncovers issues with hallucinations and accessibility compliance

Abstract

Large language models can translate natural-language chart descriptions into runnable code, yet approximately 15% of the generated scripts still fail to execute, even after supervised fine-tuning and reinforcement learning. We investigate whether this persistent error rate stems from model limitations or from reliance on a single-prompt design. To explore this, we propose a lightweight multi-agent pipeline that separates drafting, execution, repair, and judgment, using only an off-the-shelf GPT-4o-mini model. On the Text2Chart31 benchmark, our system reduces execution errors to 4.5% within three repair iterations, outperforming the strongest fine-tuned baseline by nearly 5 percentage points while requiring significantly less compute. Similar performance is observed on the ChartX benchmark, with an error rate of 4.6%, demonstrating strong generalization. Under…
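The draft, execute, repair, and judge separation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `Outcome` fields, and the stub agents are hypothetical, and in a real system each agent would presumably wrap a call to a language model. The exact role of the judge is also an assumption here (a final check of the runnable code against the description). The only detail taken from the abstract is the cap of three repair iterations.

```python
import os
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from typing import Callable, Optional

MAX_REPAIRS = 3  # the abstract reports results "within three repair iterations"


@dataclass
class RunResult:
    ok: bool
    stderr: str


@dataclass
class Outcome:
    code: str
    executed: bool
    repairs: int
    verdict: Optional[bool]  # None when the script never ran successfully


def execute(code: str, timeout: float = 30.0) -> RunResult:
    """Execution step: run the script in a fresh interpreter and capture stderr."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=timeout
        )
        return RunResult(proc.returncode == 0, proc.stderr)
    except subprocess.TimeoutExpired:
        return RunResult(False, "timeout")
    finally:
        os.unlink(path)


def generate_chart(
    description: str,
    draft: Callable[[str], str],
    repair: Callable[[str, str, str], str],
    judge: Callable[[str, str], bool],
    max_repairs: int = MAX_REPAIRS,
) -> Outcome:
    """Draft once, then alternate execution and repair until the script runs."""
    code = draft(description)
    result = execute(code)
    repairs = 0
    while not result.ok and repairs < max_repairs:
        # The repair agent sees the description, the failing code, and the error.
        code = repair(description, code, result.stderr)
        result = execute(code)
        repairs += 1
    # Assumed judge role: assess only code that actually executed.
    verdict = judge(description, code) if result.ok else None
    return Outcome(code, result.ok, repairs, verdict)


# Hypothetical stub agents standing in for model calls.
def stub_draft(description: str) -> str:
    return "print(undefined_name)\n"  # deliberately raises NameError


def stub_repair(description: str, code: str, error: str) -> str:
    return "print('chart ok')\n"


def stub_judge(description: str, code: str) -> bool:
    return True
```

Calling `generate_chart("bar chart of sales", stub_draft, stub_repair, stub_judge)` with these stubs fails on the first execution, succeeds after one repair, and then passes the judge. Keeping execution as a separate step means the repair agent works from a real traceback rather than from the model's own guess about whether the code runs.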

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification

Methods: Focus