ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch

Zheng Liu; Honglin Lin; Chonghan Qin; Xiaoyang Wang; Xin Gao; Yu Li; Mengzhang Cai; Yun Zhu; Zhanping Zhong; Qizhi Pei; Zhuoshi Pan; Xiaoran Shang; Bin Cui; Conghui He; Wentao Zhang; Lijun Wu

arXiv:2601.13606 · cs.CV · April 30, 2026

TL;DR

ChartVerse introduces a scalable framework for synthesizing complex, high-quality chart reasoning data from scratch, enabling improved vision-language models for complex chart understanding.

Contribution

The paper presents novel metrics, a complexity-aware chart synthesis method, and a truth-anchored inverse QA approach to generate rigorous reasoning data.

Findings

01

ChartVerse-8B surpasses its teacher and rivals larger models in chart reasoning tasks.

02

The framework effectively synthesizes diverse, high-complexity charts with rigorous reasoning.

03

State-of-the-art performance is achieved on chart reasoning benchmarks.

Abstract

Chart reasoning is a critical capability for Vision Language Models (VLMs). However, the development of open-source models is severely hindered by the lack of high-quality training data. Existing datasets suffer from a dual challenge: synthetic charts are often simplistic and repetitive, while the associated QA pairs are prone to hallucinations and lack the reasoning depth required for complex tasks. To bridge this gap, we propose ChartVerse, a scalable framework designed to synthesize complex charts and reliable reasoning data from scratch. (1) To address the bottleneck of simple patterns, we first introduce Rollout Posterior Entropy (RPE), a novel metric that quantifies chart complexity. Guided by RPE, we develop a complexity-aware chart coder to autonomously synthesize diverse, high-complexity charts via executable programs. (2) To guarantee reasoning rigor, we develop truth-anchored…
