MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems

Zifeng Zhu; Mengzhao Jia; Zhihan Zhang; Lang Li; Meng Jiang

arXiv:2410.14179 · cs.CL · February 11, 2025

TL;DR

MultiChartQA is a new benchmark designed to evaluate multimodal large language models on complex multi-chart reasoning tasks, addressing gaps in existing single-chart focused benchmarks and highlighting current performance limitations.

Contribution

The paper introduces MultiChartQA, a comprehensive benchmark for multi-chart reasoning in vision-language models, emphasizing multi-hop, comparative, and sequential reasoning capabilities.

Findings

01. MLLMs show significant performance gaps compared to humans.

02. Current models struggle with multi-chart multi-hop reasoning.

03. MultiChartQA reveals key challenges in multi-chart comprehension.

Abstract

Multimodal Large Language Models (MLLMs) have demonstrated impressive abilities across various tasks, including visual question answering and chart comprehension, yet existing benchmarks for chart-related tasks fall short in capturing the complexity of real-world multi-chart scenarios. Current benchmarks primarily focus on single-chart tasks, neglecting the multi-hop reasoning required to extract and integrate information from multiple charts, which is essential in practical applications. To fill this gap, we introduce MultiChartQA, a benchmark that evaluates MLLMs' capabilities in four key areas: direct question answering, parallel question answering, comparative reasoning, and sequential reasoning. Our evaluation of a wide range of MLLMs reveals significant performance gaps compared to humans. These results highlight the challenges in multi-chart comprehension and the potential of…

Code & Models

Repositories

zivenzhu/multi-chart-qa (PyTorch, official)

Videos

MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems · Underline

Taxonomy

Topics: Constraint Satisfaction and Optimization

Methods: Focus