ChartInsights: Evaluating Multimodal Large Language Models for Low-Level Chart Question Answering

Yifan Wu; Lutao Yan; Leixian Shen; Yunhai Wang; Nan Tang; Yuyu Luo

arXiv:2405.07001 · cs.CL · November 7, 2024 · 1 citation

TL;DR

This paper evaluates multimodal large language models on low-level chart question answering tasks using a new dataset, revealing current limitations and proposing prompt strategies that significantly improve model accuracy.

Contribution

It introduces ChartInsights, a comprehensive dataset for low-level ChartQA, and proposes prompt strategies that enhance MLLM performance in this domain.

Findings

01 GPT-4o achieves 69.17% accuracy on average.

02 Prompt strategies improve accuracy by over 14%.

03 Visual element manipulation impacts model performance.

Abstract

Chart question answering (ChartQA) tasks play a critical role in interpreting and extracting insights from visualization charts. While recent advancements in multimodal large language models (MLLMs) like GPT-4o have shown promise in high-level ChartQA tasks, such as chart captioning, their effectiveness in low-level ChartQA tasks (e.g., identifying correlations) remains underexplored. In this paper, we address this gap by evaluating MLLMs on low-level ChartQA using a newly curated dataset, ChartInsights, which consists of 22,347 (chart, task, query, answer) tuples covering 10 data analysis tasks across 7 chart types. We systematically evaluate 19 advanced MLLMs, including 12 open-source and 7 closed-source models. The average accuracy across these models is 39.8%, with GPT-4o achieving the highest accuracy at 69.17%. To further explore the limitations of MLLMs in low-level ChartQA, we…
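The (chart, task, query, answer) structure described in the abstract lends itself to a simple per-task accuracy computation. The sketch below is illustrative only and is not the authors' evaluation code: the `ChartQARecord` class, its field names, the `accuracy_by_task` helper, and the case-insensitive exact-match scoring rule are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartQARecord:
    """One (chart, task, query, answer) item; field names are hypothetical."""
    chart: str   # assumed: path or identifier of the chart image
    task: str    # low-level analysis task label, e.g. "correlation"
    query: str   # natural-language question about the chart
    answer: str  # reference answer


def accuracy_by_task(records, predictions):
    """Return {task: fraction correct}.

    Scoring is case-insensitive exact match after trimming whitespace,
    which is an assumption of this sketch, not the paper's metric.
    """
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, prediction in zip(records, predictions):
        total[record.task] += 1
        if prediction.strip().lower() == record.answer.strip().lower():
            correct[record.task] += 1
    return {task: correct[task] / total[task] for task in total}


# Toy data, invented for illustration only.
records = [
    ChartQARecord("c1.png", "correlation", "Are x and y correlated?", "Positive"),
    ChartQARecord("c2.png", "correlation", "Are x and y correlated?", "negative"),
    ChartQARecord("c3.png", "other_task", "What is the largest value?", "42"),
]
predictions = ["positive ", "positive", "42"]
scores = accuracy_by_task(records, predictions)
```

Grouping by task rather than reporting a single number makes it easier to see which kinds of low-level questions a model struggles with.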

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling