RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation

Jiajun Zhang; Yuying Li; Zhixun Li; Xingyu Guo; Jingzhuo Wu; Leqi Zheng; Yiran Yang; Jianke Zhang; Qingbin Li; Shannan Yan; Zhetong Li; Changguo Jia; Junfei Wu; Zilei Wang; Qiang Liu; Liang Wang

arXiv:2603.25804·cs.CL·March 30, 2026

TL;DR

RealChart2Code introduces a large-scale benchmark for evaluating vision-language models on complex, real-world chart generation tasks, highlighting current limitations and guiding future research.

Contribution

RealChart2Code is the first benchmark to systematically evaluate both chart generation from authentic, large-scale raw data and iterative code refinement in a multi-turn conversational setting.

Findings

1. VLMs show a significant performance drop on complex, real-world charts compared to simpler benchmarks.
2. A large performance gap separates proprietary and open-weight models.
3. State-of-the-art VLMs often fail to generate multi-panel charts accurately.

Abstract

Vision-Language Models (VLMs) have demonstrated impressive capabilities in code generation across various domains. However, their ability to replicate complex, multi-panel visualizations from real-world data remains largely unassessed. To address this gap, we introduce RealChart2Code, a new large-scale benchmark with over 2,800 instances grounded in authentic datasets and featuring tasks with clear analytical intent. Crucially, it is the first benchmark to systematically evaluate chart generation from large-scale raw data and assess iterative code refinement in a multi-turn conversational setting. Our comprehensive evaluation of 14 leading VLMs on RealChart2Code reveals significant performance degradation compared to simpler benchmarks, highlighting their struggles with complex plot structures and authentic data. Our analysis uncovers a substantial performance…

Code & Models

Repositories

Speakn0w/RealChart2Code (GitHub)
