Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent Approach
James Ford, Anthony Rios

TL;DR
This paper introduces a multi-agent pipeline built on GPT-4o-mini that substantially reduces execution errors in text-to-chart generation, and it shows that current benchmarks mask problems such as hallucinations and poor accessibility.
Contribution
It proposes a lightweight multi-agent approach that improves execution success rates and reveals the need to focus on chart quality and accessibility beyond mere execution.
Findings
Execution error rate reduced to 4.5% on Text2Chart31
System outperforms the strongest fine-tuned baseline by nearly 5 percentage points
Manual review uncovers issues with hallucinations and accessibility compliance
Abstract
Large language models can translate natural-language chart descriptions into runnable code, yet approximately 15% of the generated scripts still fail to execute, even after supervised fine-tuning and reinforcement learning. We investigate whether this persistent error rate stems from model limitations or from reliance on a single-prompt design. To explore this, we propose a lightweight multi-agent pipeline that separates drafting, execution, repair, and judgment, using only an off-the-shelf GPT-4o-mini model. On the Text2Chart31 benchmark, our system reduces execution errors to 4.5% within three repair iterations, outperforming the strongest fine-tuned baseline by nearly 5 percentage points while requiring significantly less compute. Similar performance is observed on the ChartX benchmark, with an error rate of 4.6%, demonstrating strong generalization. Under…
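The separation of drafting, execution, repair, and judgment described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the callable signatures, the subprocess-based execution check, and the stub agents are all assumptions made for illustration. In a real system the `draft`, `repair`, and `judge` callables would wrap model calls; here they are hypothetical stand-ins so the loop can run without any API access. The default cap of three repairs mirrors the "three repair iterations" mentioned in the abstract.

```python
import subprocess
import sys
from typing import Callable, Optional, Tuple


def run_script(code: str, timeout_s: float = 30.0) -> Tuple[bool, str]:
    """Execute code in a fresh interpreter; return (succeeded, error text)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s} seconds"
    return proc.returncode == 0, proc.stderr


def generate_chart_code(
    description: str,
    draft: Callable[[str], str],                          # hypothetical drafting agent
    repair: Callable[[str, str], str],                    # hypothetical repair agent
    judge: Optional[Callable[[str, str], str]] = None,    # hypothetical judging agent
    max_repairs: int = 3,                                 # assumed cap on repair rounds
) -> Tuple[str, bool, int, Optional[str]]:
    """Draft a script, then alternate execution and repair until it runs."""
    code = draft(description)
    ok, error = run_script(code)
    repairs = 0
    while not ok and repairs < max_repairs:
        # Feed the failing script and its error text back to the repair agent.
        code = repair(code, error)
        repairs += 1
        ok, error = run_script(code)
    # Judgment is only attempted on scripts that actually ran.
    verdict = judge(description, code) if (ok and judge is not None) else None
    return code, ok, repairs, verdict


# Stub agents (assumptions): the first draft fails, the repair succeeds.
def stub_draft(description: str) -> str:
    return "print(undefined_name)"


def stub_repair(code: str, error: str) -> str:
    return "print('chart drawn')"


def stub_judge(description: str, code: str) -> str:
    return "runs, but visual quality not checked"
```

Calling `generate_chart_code("bar chart of sales", stub_draft, stub_repair, stub_judge)` walks through one failed execution and one repair. Keeping the judge as a separate step reflects the paper's point that a script which runs is not necessarily a correct or accessible chart.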
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification
