From Charts to Code: A Hierarchical Benchmark for Multimodal Models

Jiahao Tang; Henry Hengyuan Zhao; Lijian Wu; Zijian Zhang; Yifei Tao; Dongxing Mao; Yang Wan; Jingru Tan; Min Zeng; Min Li; Alex Jinpeng Wang

arXiv:2510.17932 · cs.SE · April 21, 2026

TL;DR

Chart2Code is a hierarchical benchmark that evaluates how well multimodal models understand charts and generate the code that renders them. Its tasks range from chart reproduction through chart editing to long-table-to-chart generation, and it comes with an extensive set of evaluation metrics.

Contribution

To the authors' knowledge, Chart2Code is the first hierarchical benchmark that systematically scales task difficulty for chart understanding and code generation in multimodal models.

Findings

1. The state-of-the-art GPT-5 achieves only 0.57 on code correctness.
2. Models struggle with complex chart editing tasks.
3. The benchmark contains 2,023 tasks across 22 chart types.

Abstract

We introduce Chart2Code, a new benchmark for evaluating the chart understanding and code generation capabilities of large multimodal models (LMMs). Chart2Code is explicitly designed from a user-driven perspective, capturing diverse real-world scenarios and progressively increasing task difficulty. It consists of three levels: Level 1 (Chart Reproduction) reproduces charts from a reference figure and user query; Level 2 (Chart Editing) involves complex modifications such as changing chart types or adding elements; and Level 3 (Long-Table to Chart Generation) requires models to transform long, information-dense tables into faithful charts following user instructions. To our knowledge, this is the first hierarchical benchmark that reflects practical chart2code usage while systematically scaling task complexity. In total, Chart2Code contains 2,023 tasks across 22 chart types, paired with…
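
The three levels described in the abstract can be pictured as a small task record. The sketch below is illustrative only: the class names, field names, and the assumption that Level 2 edits start from a reference image are hypothetical and are not taken from the benchmark's actual data format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Level(Enum):
    REPRODUCTION = 1  # Level 1: reproduce a chart from a reference figure and query
    EDITING = 2       # Level 2: modify a chart (change type, add elements)
    LONG_TABLE = 3    # Level 3: turn a long table into a chart


@dataclass
class Task:
    # Hypothetical schema for illustration; not the benchmark's real format.
    level: Level
    instruction: str                       # the user query
    reference_image: Optional[str] = None  # path to a reference chart, if any
    table_path: Optional[str] = None       # path to a source table, if any

    def required_inputs(self) -> List[str]:
        """List the fields this task's level needs, under the assumptions above."""
        inputs = ["instruction"]
        if self.level in (Level.REPRODUCTION, Level.EDITING):
            # Assumed: editing tasks also start from an existing chart image.
            inputs.append("reference_image")
        if self.level is Level.LONG_TABLE:
            inputs.append("table_path")
        return inputs

    def is_complete(self) -> bool:
        """True when every required field is present and non-empty."""
        return all(getattr(self, name) for name in self.required_inputs())
```

For example, `Task(Level.LONG_TABLE, "Plot monthly totals")` would report itself incomplete under this sketch until a `table_path` is supplied, mirroring how Level 3 depends on a table rather than a reference figure.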
