Protecting multimodal large language models against misleading visualizations

Jonathan Tonglet; Tinne Tuytelaars; Marie-Francine Moens; Iryna Gurevych

arXiv:2502.20503 · cs.CL · April 20, 2026


TL;DR

This paper shows that multimodal large language models (MLLMs) are vulnerable to misleading visualizations: their question-answering accuracy drops, on average, to the level of the random baseline. It proposes inference-time methods that improve robustness without sacrificing accuracy on non-misleading charts.

Contribution

The paper uncovers a key vulnerability of MLLMs to misleading visualizations and provides the first comparison of six inference-time methods, identifying two effective ones: table-based QA and redrawing the visualization.

Findings

01. MLLM QA accuracy drops to random baseline on misleading charts

02. Two inference-time methods improve accuracy by up to 19.6 percentage points

03. Code and data are made publicly available for further research

Abstract

Visualizations play a pivotal role in daily communication in an increasingly data-driven world. Research on multimodal large language models (MLLMs) for automated chart understanding has accelerated massively, with steady improvements on standard benchmarks. However, for MLLMs to be reliable, they must be robust to misleading visualizations, i.e., charts that distort the underlying data, leading readers to draw inaccurate conclusions. Here, we uncover an important vulnerability: MLLM question-answering (QA) accuracy on misleading visualizations drops on average to the level of the random baseline. To address this, we provide the first comparison of six inference-time methods to improve QA performance on misleading visualizations, without compromising accuracy on non-misleading ones. We find that two methods, table-based QA and redrawing the visualization, are effective, with…
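
To make the idea of table-based QA concrete, the following is a minimal sketch, assuming the method works in two stages: the chart is first converted to its underlying data table, and the question is then answered from the table alone, so the distorting visual encoding is not seen at answer time. This reading is inferred from the method's name; the function names, the callable interfaces, and the stub models below are all hypothetical and are not taken from the authors' code.

```python
from typing import Callable

# Hypothetical interfaces (assumptions, not the paper's API):
TableExtractor = Callable[[bytes], str]   # chart image bytes -> table as CSV text
TextAnswerer = Callable[[str, str], str]  # (table, question) -> answer


def table_based_qa(chart_image: bytes, question: str,
                   extract_table: TableExtractor,
                   answer_from_table: TextAnswerer) -> str:
    """Answer a chart question through an intermediate data table."""
    # Stage 1: recover the underlying data (in practice, a model call).
    table = extract_table(chart_image)
    # Stage 2: answer from the table only; the chart image is not passed on.
    return answer_from_table(table, question)


def stub_extract_table(chart_image: bytes) -> str:
    # Stand-in for a chart-to-table model; returns a fixed CSV table.
    return "year,sales\n2022,98\n2023,100"


def stub_answer_from_table(table: str, question: str) -> str:
    # Stand-in for a text-only model: classifies relative growth
    # between the first and last data rows.
    rows = [line.split(",") for line in table.splitlines()[1:]]
    first, last = int(rows[0][1]), int(rows[-1][1])
    growth = (last - first) / first
    return "small" if growth < 0.10 else "large"


if __name__ == "__main__":
    answer = table_based_qa(b"", "Is the sales growth small or large?",
                            stub_extract_table, stub_answer_from_table)
    print(answer)
```

With the stub table, the growth from 98 to 100 is about 2%, so the answer depends only on the numbers, regardless of how a truncated axis might exaggerate the difference in the chart itself.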


