GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models

Jonathan Roberts; Kai Han; Samuel Albanie

arXiv:2408.11817 · cs.CV · October 17, 2025

Open Access 3 Datasets

TL;DR

GRAB is a challenging, predominantly synthetic benchmark of 3284 questions for evaluating the graph analysis capabilities of large multimodal models (LMMs). The best of the evaluated models scores only 21%, leaving substantial headroom for future models.

Contribution

Introduces GRAB, a predominantly synthetic graph analysis benchmark with 3284 questions covering five tasks and 23 graph properties, designed to evaluate current and future frontier large multimodal models.

Findings

01

Current LMMs perform poorly on GRAB: the top score is only 21%.

02

GRAB reveals specific areas where models struggle in graph reasoning.

03

The benchmark and a lightweight version are released to foster progress.

Abstract

Large multimodal models (LMMs) have exhibited proficiencies across many visual tasks. Although numerous well-known benchmarks exist to evaluate model performance, they increasingly have insufficient headroom. As such, there is a pressing need for a new generation of benchmarks challenging enough for the next generation of LMMs. One area in which LMMs show potential is graph analysis: specifically, the tasks an analyst might typically perform when interpreting figures, such as estimating the mean, intercepts, or correlations of functions and data series. In this work, we introduce GRAB, a graph analysis benchmark fit for current and future frontier LMMs. Our benchmark is predominantly synthetic, ensuring high-quality, noise-free questions. GRAB comprises 3284 questions, covering five tasks and 23 graph properties. We evaluate 20 LMMs on GRAB, finding it to be a challenging benchmark,…
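
To illustrate why a synthetic benchmark can have noise-free answers, here is a minimal sketch of building one question whose ground truth follows exactly from the parameters used to generate the series. It is not the authors' generation pipeline: the names `make_question` and `is_correct`, the parameter ranges, the question wording, and the rounding-based scoring rule are all assumptions made for illustration, and the plotting step that would render the series into an image is omitted.

```python
import random
import statistics


def make_question(seed: int, n_points: int = 20) -> dict:
    """Build one hypothetical synthetic question with exact ground truth."""
    rng = random.Random(seed)
    slope = rng.randint(-5, 5)        # assumed parameter range
    intercept = rng.randint(-10, 10)  # assumed parameter range
    xs = [float(i) for i in range(n_points)]
    # Noise-free linear series: every plotted point lies exactly on the line.
    ys = [slope * x + intercept for x in xs]
    return {
        "xs": xs,
        "ys": ys,
        "slope": slope,
        "question": "What is the mean of the y values of the plotted series?",
        # The answer is computed from the data, not annotated by hand.
        "answer": statistics.fmean(ys),
        # A second property that could be asked about the same graph.
        "y_intercept": float(intercept),
    }


def is_correct(prediction: str, truth: float, ndigits: int = 1) -> bool:
    """Assumed scoring rule: parse the model output and compare after rounding."""
    try:
        value = float(prediction.strip())
    except ValueError:
        return False
    return round(value, ndigits) == round(truth, ndigits)


if __name__ == "__main__":
    q = make_question(seed=0)
    print(q["question"])
    print("ground truth:", q["answer"])
    print("scored:", is_correct(str(q["answer"]), q["answer"]))
```

Because the answer is derived from the generating parameters, there is no labelling noise to audit; the difficulty lies entirely in reading the rendered figure.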

Taxonomy

Topics: Natural Language Processing Techniques