ChartComplete: A Taxonomy-based Inclusive Chart Dataset
Ahmad Mustapha, Charbel Toumieh, and Mariette Awad

TL;DR
ChartComplete is a comprehensive dataset covering thirty chart types, designed to advance chart understanding by providing a broad, taxonomy-based collection of classified chart images for benchmarking and research.
Contribution
The paper introduces a new, extensive chart dataset built on a visualization taxonomy, addressing the narrow chart-type coverage of existing benchmark datasets for multimodal large language models.
Findings
Provides a diverse, taxonomy-based chart dataset
Enables more comprehensive evaluation of chart understanding models
Fills a gap in existing chart datasets
Abstract
With advancements in deep learning (DL) and computer vision techniques, the field of chart understanding is evolving rapidly. In particular, multimodal large language models (MLLMs) are proving to be efficient and accurate in understanding charts. To accurately measure the performance of MLLMs, the research community has developed multiple datasets to serve as benchmarks. By examining these datasets, we found that they are all limited to a small set of chart types. To bridge this gap, we propose the ChartComplete dataset. The dataset is based on a chart taxonomy borrowed from the visualization community, and it covers thirty different chart types. The dataset is a collection of classified chart images and does not include a learning signal. We present the ChartComplete dataset as is for the community to build upon.
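Because the abstract describes the dataset as a collection of classified chart images, a first step for anyone building on it would likely be to index the images by chart type and check class coverage. The following is a minimal sketch only: it assumes a hypothetical on-disk layout with one subfolder per chart type, which the source does not specify, and the function names `index_chart_images` and `class_counts` are illustrative and not part of any released tooling.

```python
from collections import Counter
from pathlib import Path

# Assumption: images are stored as common raster formats.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def index_chart_images(root):
    """Return (image_path, chart_type) pairs.

    Assumes a hypothetical layout of root/<chart_type>/<image file>;
    the actual ChartComplete layout may differ.
    """
    root = Path(root)
    samples = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image_path in sorted(class_dir.iterdir()):
            # Skip anything that is not an image (notes, metadata, etc.).
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((image_path, class_dir.name))
    return samples


def class_counts(samples):
    """Count how many images were found for each chart type."""
    return Counter(label for _, label in samples)
```

Calling `class_counts(index_chart_images(some_root))` on a local copy would give a per-type image count, which can be compared against the thirty chart types the abstract reports.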
Taxonomy
Topics: Handwritten Text Recognition Techniques · Multimodal Machine Learning Applications · Artificial Intelligence Applications
