ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild
Ahmed Masry, Megh Thakkar, Aayush Bajaj, Aaryaman Kartha, Enamul Hoque, Shafiq Joty

TL;DR
ChartGemma is a new model that improves chart understanding by training directly on chart images, capturing visual trends and patterns, and achieving state-of-the-art results across multiple benchmarks.
Contribution
It introduces a novel training approach for chart reasoning models that uses instruction-tuning data generated from chart images, enhancing performance and generalizability.
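As a rough illustration of what "instruction-tuning data generated from chart images" could look like on disk, the sketch below pairs each chart image with an instruction and a target response and serializes the pairs as JSON Lines. The class name, field names, file path, and placeholder texts are all hypothetical; this is not the authors' data format or generation pipeline.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ChartInstructionExample:
    """One hypothetical training record; field names are assumptions."""
    image_path: str   # chart image the instruction was generated from
    instruction: str  # task prompt shown to the model alongside the image
    response: str     # target text the model is trained to produce


def to_jsonl(examples):
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in examples)


# Placeholder record: the path and texts are illustrative only.
examples = [
    ChartInstructionExample(
        image_path="charts/0001.png",
        instruction="Describe the overall trend shown in the chart.",
        response="<response generated from the chart image>",
    ),
]

jsonl = to_jsonl(examples)
```

The point of the layout is that each record references the image itself and not an underlying data table, which is the distinction the contribution statement draws.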
Findings
Achieves state-of-the-art results on 5 chart reasoning benchmarks.
Generates more realistic and factually accurate chart summaries.
Outperforms existing models by directly leveraging visual information.
Abstract
Given the ubiquity of charts as a data analysis, visualization, and decision-making tool across industries and sciences, there has been a growing interest in developing pre-trained foundation models as well as general purpose instruction-tuned models for chart understanding and reasoning. However, existing methods suffer crucial drawbacks across two critical axes affecting the performance of chart representation models: they are trained on data generated from underlying data tables of the charts, ignoring the visual trends and patterns in chart images, and use weakly aligned vision-language backbone models for domain-specific training, limiting their generalizability when encountering charts in the wild. We address these important drawbacks and introduce ChartGemma, a novel chart understanding and reasoning model developed over PaliGemma. Rather than relying on underlying data tables,…
Taxonomy
Topics: Educational Games and Gamification · Sports Analytics and Performance · Open Education and E-Learning
Methods: Sparse Evolutionary Training
