ChartCards: A Chart-Metadata Generation Framework for Multi-Task Chart Understanding
Yifan Wu, Lutao Yan, Leixian Shen, Yinan Mei, Jiannan Wang, Yuyu Luo

TL;DR
This paper introduces ChartCards, a framework for generating comprehensive chart metadata to facilitate multi-task chart understanding, reducing data collection costs and improving model performance across various chart-related tasks.
Contribution
We propose ChartCards, a unified chart-metadata generation framework that enables multi-task chart understanding, and use it to build MetaChart, a large, high-quality dataset for training and evaluation.
Findings
Fine-tuning models on MetaChart improves performance by 5% on average.
The most notable gains are in text-to-chart retrieval and chart-to-table conversion.
On these two tasks, Long-CLIP and Llama 3.2-11B improve by 17% and 28%, respectively.
Abstract
The emergence of Multi-modal Large Language Models (MLLMs) presents new opportunities for chart understanding. However, due to the fine-grained nature of these tasks, applying MLLMs typically requires large, high-quality datasets for task-specific fine-tuning, leading to high data collection and training costs. To address this, we propose ChartCards, a unified chart-metadata generation framework for multi-task chart understanding. ChartCards systematically synthesizes various chart information, including data tables, visualization code, visual elements, and multi-dimensional semantic captions. By structuring this information into organized metadata, ChartCards enables a single chart to support multiple downstream tasks, such as text-to-chart retrieval, chart summarization, chart-to-table conversion, chart description, and chart question answering. Using ChartCards, we further construct…
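As a rough illustration of how one structured metadata record could feed several downstream tasks, the sketch below defines a hypothetical `ChartCard` record and derives one training sample per task from it. The field names, the `derive_task_samples` helper, the sample layouts, and the template-based description and question are all assumptions made for illustration; they are not taken from the paper and may differ from how ChartCards or MetaChart actually store or generate data.

```python
from dataclasses import dataclass


@dataclass
class ChartCard:
    """Hypothetical metadata record for one chart (all field names are assumptions)."""
    chart_id: str
    data_table: list        # list of dict rows holding the underlying data
    vis_code: str           # code that renders the chart
    visual_elements: dict   # e.g. chart type and title
    captions: dict          # caption dimension -> caption text


def table_to_csv(rows):
    """Serialize rows as CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    header = list(rows[0].keys())
    lines = [",".join(header)]
    for row in rows:
        lines.append(",".join(str(row[col]) for col in header))
    return "\n".join(lines)


def derive_task_samples(card):
    """Build one illustrative sample per task from a single metadata record."""
    image_ref = card.chart_id + ".png"  # placeholder for the rendered chart image
    summary = card.captions.get("summary", "")
    elements = card.visual_elements
    samples = {
        "text_to_chart_retrieval": {"query": summary, "positive": image_ref},
        "chart_summarization": {"input": image_ref, "target": summary},
        "chart_to_table": {"input": image_ref,
                           "target": table_to_csv(card.data_table)},
        "chart_description": {
            "input": image_ref,
            "target": "A {} chart titled '{}'.".format(
                elements.get("chart_type", "unknown"),
                elements.get("title", "untitled")),
        },
    }
    # A simple lookup question built from the first row, if the table allows it.
    if card.data_table and len(card.data_table[0]) >= 2:
        first = card.data_table[0]
        label_col, value_col = list(first.keys())[:2]
        samples["chart_qa"] = {
            "input": image_ref,
            "question": "What is the value of {} when {} is {}?".format(
                value_col, label_col, first[label_col]),
            "answer": str(first[value_col]),
        }
    return samples


# Toy example record (made-up data, for illustration only).
card = ChartCard(
    chart_id="chart_0001",
    data_table=[{"year": 2020, "sales": 10}, {"year": 2021, "sales": 15}],
    vis_code="plt.bar(years, sales)",
    visual_elements={"chart_type": "bar", "title": "Annual sales"},
    captions={"summary": "Sales rose from 10 to 15 between 2020 and 2021."},
)
samples = derive_task_samples(card)
```

Here a single record yields samples for all five tasks named in the abstract, which is the property the framework relies on: the cost of producing metadata for a chart is paid once and reused across tasks.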
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Mobile Crowdsensing and Crowdsourcing
Methods: LLaMA
