RealCQA-V2: A Diagnostic Benchmark for Structured Visual Entailment over Scientific Charts
Saleem Ahmed, Srirangaraj Setlur, Venu Govindaraju

TL;DR
RealCQA-V2 is a new benchmark for structured visual entailment in scientific charts, enabling atomic reasoning verification and diagnosing multimodal reasoning capabilities of models.
Contribution
It introduces a structured logical entailment task over chart elements, with chain-level metrics, to evaluate and diagnose multimodal reasoning in scientific chart understanding.
Findings
Baseline large vision-language models (LVLMs) verify individual premises well but struggle to keep a full reasoning chain coherent, exposing a gap between local premise verification and global chain coherence.
RealCQA-V2 provides a reproducible benchmark for structured visual entailment.
Abstract
Multimodal reasoning models often produce fluent answers supported by seemingly coherent rationales. Existing benchmarks evaluate only final-answer correctness. They do not support atomic visual entailment verification of intermediate steps, especially visual compositional logic. This limitation is especially acute in scientific chart understanding, where answers depend on deterministically grounded visual semantics such as axes, legends, and quantitative relations. We introduce RealCQA-V2, a large-scale benchmark that reformulates chart question answering as Visual Premise Proving (VPP): a structured logical entailment task over chart-grounded visual predicates. Each question is deconstructed into manually curated, atomic premises grounded in chart elements (axes, legends, marks, and quantitative relations), yielding executable reasoning chains rather than free-form textual rationales.…
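To make the distinction between premise-level and chain-level evaluation concrete, here is a minimal sketch in Python. It is not the benchmark's official scoring code: the `Premise` class, the function names `premise_accuracy` and `chain_accuracy`, and the example premises are all hypothetical, and the all-premises-correct rule is only one plausible reading of "chain-level" scoring.

```python
from dataclasses import dataclass


@dataclass
class Premise:
    """Hypothetical record for one atomic, chart-grounded premise."""
    text: str        # the atomic statement about the chart
    gold: bool       # reference entailment label
    predicted: bool  # the model's entailment verdict


def premise_accuracy(chains):
    """Fraction of individual premises the model judged correctly."""
    premises = [p for chain in chains for p in chain]
    if not premises:
        return 0.0
    return sum(p.predicted == p.gold for p in premises) / len(premises)


def chain_accuracy(chains):
    """Fraction of chains in which every premise was judged correctly.

    Assumption: a chain counts as correct only if all of its premises are.
    """
    if not chains:
        return 0.0
    correct = sum(all(p.predicted == p.gold for p in chain) for chain in chains)
    return correct / len(chains)


# Illustrative, made-up data: two reasoning chains for two chart questions.
chains = [
    [
        Premise("The x-axis is labelled 'Year'", True, True),
        Premise("The legend contains a series named 'A'", True, True),
        Premise("Series 'A' peaks in 2010", True, False),  # one wrong step
    ],
    [
        Premise("The y-axis uses a linear scale", True, True),
        Premise("Series 'B' exceeds series 'A' everywhere", False, False),
    ],
]

print(premise_accuracy(chains))  # 4 of 5 premises correct
print(chain_accuracy(chains))    # 1 of 2 chains fully correct
```

In this toy example a single wrong premise leaves premise-level accuracy high while halving chain-level accuracy, which is the kind of gap between local verification and global coherence that the summary describes.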
Taxonomy
Topics: Machine Learning and Data Classification
