MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems
Zifeng Zhu, Mengzhao Jia, Zhihan Zhang, Lang Li, Meng Jiang

TL;DR
MultiChartQA is a benchmark for evaluating multimodal large language models (MLLMs) on multi-chart reasoning tasks. It addresses a gap left by existing benchmarks, which focus on single charts, and exposes the performance limitations of current models.
Contribution
The paper introduces MultiChartQA, a benchmark for multi-chart reasoning in vision-language models. It covers direct question answering, parallel question answering, comparative reasoning, and sequential reasoning, with an emphasis on multi-hop reasoning across charts.
Findings
MLLMs fall significantly short of human performance on multi-chart tasks.
Current models struggle with multi-chart multi-hop reasoning.
MultiChartQA reveals key challenges in multi-chart comprehension.
Abstract
Multimodal Large Language Models (MLLMs) have demonstrated impressive abilities across various tasks, including visual question answering and chart comprehension, yet existing benchmarks for chart-related tasks fall short in capturing the complexity of real-world multi-chart scenarios. Current benchmarks primarily focus on single-chart tasks, neglecting the multi-hop reasoning required to extract and integrate information from multiple charts, which is essential in practical applications. To fill this gap, we introduce MultiChartQA, a benchmark that evaluates MLLMs' capabilities in four key areas: direct question answering, parallel question answering, comparative reasoning, and sequential reasoning. Our evaluation of a wide range of MLLMs reveals significant performance gaps compared to humans. These results highlight the challenges in multi-chart comprehension and the potential of…
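As a rough illustration of how results across the four evaluation areas named in the abstract could be tallied, the following is a minimal sketch, not the authors' evaluation code. The record layout (`category`, `prediction`, `reference`), the short category labels, and the use of normalized exact-match scoring are all assumptions made for this example; the source does not describe the benchmark's actual data format or scoring rule.

```python
from collections import defaultdict

# The four areas come from the abstract; these short labels are hypothetical.
CATEGORIES = ("direct", "parallel", "comparative", "sequential")


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.strip().lower().split())


def accuracy_by_category(records):
    """Return exact-match accuracy per category.

    `records` is an iterable of dicts with the (assumed) keys
    'category', 'prediction', and 'reference'.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        category = record["category"]
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        total[category] += 1
        if normalize(record["prediction"]) == normalize(record["reference"]):
            correct[category] += 1
    # Report only categories that actually appear in the input.
    return {c: correct[c] / total[c] for c in CATEGORIES if total[c]}


# Hypothetical records, purely to show the call shape.
sample_records = [
    {"category": "direct", "prediction": " 42 ", "reference": "42"},
    {"category": "direct", "prediction": "41", "reference": "42"},
    {"category": "sequential", "prediction": "Yes", "reference": "yes"},
]
sample_scores = accuracy_by_category(sample_records)
```

Reporting accuracy per area rather than as a single aggregate makes it easier to see which kind of multi-chart reasoning a model handles worst.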
Taxonomy
Topics: Constraint Satisfaction and Optimization
Methods: Focus
