DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards
Aaryaman Kartha, Ahmed Masry, Mohammed Saidul Islam, Thinh Lang, Shadikur Rahman, Ridwan Mahbub, Mizanur Rahman, Mahir Ahmed, Md Rizwan Parvez, Enamul Hoque, Shafiq Joty

TL;DR
DashboardQA introduces a new benchmark for evaluating multimodal agents' ability to understand and interact with real-world, interactive dashboards, highlighting current limitations and challenges in this emerging field.
Contribution
This paper presents the first benchmark specifically designed to assess how well GUI agents comprehend and interact with real-world dashboards, filling a critical gap in existing QA benchmarks.
Findings
All evaluated models perform poorly, with top accuracy around 39%.
Current models struggle with grounding, planning, and reasoning on dashboards.
Interactive dashboard reasoning remains a significant challenge for vision-language models.
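To make the accuracy figure concrete, here is a minimal sketch of how accuracy over question-answer pairs could be computed. It assumes a simple normalized exact-match rule; the paper's actual scoring protocol is not described in this summary, and the names `normalize` and `accuracy` are hypothetical helpers, not part of any DashboardQA release.

```python
def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.strip().lower().split())


def accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that match their reference after normalization.

    Assumption: exact match after normalization. A real benchmark may use
    numeric tolerance or a more lenient matching scheme instead.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)
```

Under this assumed metric, an accuracy of about 39% on the benchmark's 405 question-answer pairs would correspond to roughly 158 correct answers.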
Abstract
Dashboards are powerful visualization tools for data-driven decision-making, integrating multiple interactive views that allow users to explore, filter, and navigate data. Unlike static charts, dashboards support rich interactivity, which is essential for uncovering insights in real-world analytical workflows. However, existing question-answering benchmarks for data visualizations largely overlook this interactivity, focusing instead on static charts. This limitation severely constrains their ability to evaluate the capabilities of modern multimodal agents designed for GUI-based reasoning. To address this gap, we introduce DashboardQA, the first benchmark explicitly designed to assess how vision-language GUI agents comprehend and interact with real-world dashboards. The benchmark includes 112 interactive dashboards from Tableau Public and 405 question-answer pairs with interactive…
