RadImageNet-VQA: A Large-Scale CT and MRI Dataset for Radiologic Visual Question Answering

Léo Butsanets; Charles Corbière; Julien Khlaut; Pierre Manceron; Corentin Dancette

arXiv:2512.17396·cs.CV·March 31, 2026

TL;DR

RadImageNet-VQA is a large-scale CT and MRI dataset for radiologic VQA: 750K images paired with 7.5M question-answer samples, built from expert-curated annotations and covering abnormality detection, anatomy recognition, and pathology identification.

Contribution

This work provides the first large-scale, expert-annotated CT and MRI dataset for radiologic VQA, covering multiple tasks and ensuring robustness against text-based shortcuts.

Findings

01

State-of-the-art models struggle with fine-grained pathology identification.

02

Model performance drops to near-random without image inputs.

03

The dataset is designed to resist linguistic shortcuts and is publicly available.
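The text-only check behind finding 02 can be illustrated with a small sketch. It compares a question-only baseline against uniform random guessing; if the two are close, the question text alone carries little exploitable signal. The record fields (`question`, `options`, `answer`), the toy data, and both function names are assumptions made for illustration and are not taken from the paper or its released code.

```python
from collections import Counter

def chance_accuracy(samples):
    """Expected accuracy of uniform random guessing over each sample's options."""
    return sum(1.0 / len(s["options"]) for s in samples) / len(samples)

def question_prior_accuracy(train, test):
    """Accuracy of a text-only baseline: always give the most frequent
    training answer seen for the same question string (no image used)."""
    by_question = {}
    for s in train:
        by_question.setdefault(s["question"], Counter())[s["answer"]] += 1
    correct = 0
    for s in test:
        counts = by_question.get(s["question"])
        # Fall back to the first option for questions unseen in training.
        guess = counts.most_common(1)[0][0] if counts else s["options"][0]
        correct += guess == s["answer"]
    return correct / len(test)

# Toy, hand-made records (hypothetical; answers are balanced on purpose).
QUESTION = "Is an abnormality present?"
TRAIN = [
    {"question": QUESTION, "options": ["yes", "no"], "answer": "yes"},
    {"question": QUESTION, "options": ["yes", "no"], "answer": "no"},
]
TEST = [
    {"question": QUESTION, "options": ["yes", "no"], "answer": "yes"},
    {"question": QUESTION, "options": ["yes", "no"], "answer": "no"},
]

CHANCE = chance_accuracy(TEST)
TEXT_ONLY = question_prior_accuracy(TRAIN, TEST)
print(f"chance={CHANCE:.2f} text_only={TEXT_ONLY:.2f}")
```

With balanced answers, the question-only baseline cannot beat chance. A dataset with text-based shortcuts would instead show a text-only score well above the chance level.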

Abstract

In this work, we introduce RadImageNet-VQA, a large-scale dataset designed to advance radiologic visual question answering (VQA) on CT and MRI exams. Existing medical VQA datasets are limited in scale, dominated by X-ray imaging or biomedical illustrations, and often prone to text-based shortcuts. RadImageNet-VQA is built from expert-curated annotations and provides 750K images paired with 7.5M question-answer samples. It covers three key tasks (abnormality detection, anatomy recognition, and pathology identification), spanning eight anatomical regions and 97 pathology categories, and supports open-ended, closed-ended, and multiple-choice questions. Extensive experiments show that state-of-the-art vision-language models still struggle with fine-grained pathology identification, particularly in open-ended settings and even after fine-tuning. Text-only analysis further reveals that model…
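Because the abstract reports results that differ by task and by question format, a per-group breakdown is a natural way to read model scores. The sketch below groups accuracy by (task, format) using normalized exact match. The record layout, the placeholder labels, and the use of exact match for open-ended answers are all assumptions for illustration; the paper may score open-ended answers differently. The first line only restates the abstract's rounded figures, which work out to roughly 10 question-answer samples per image.

```python
from collections import defaultdict

# From the abstract's rounded figures: 7.5M QA samples over 750K images.
AVG_QA_PER_IMAGE = 7_500_000 / 750_000

def normalize(text):
    """Lowercase, trim, drop a trailing period, and collapse whitespace."""
    return " ".join(text.lower().strip().rstrip(".").split())

def accuracy_by_group(records):
    """Return {(task, format): accuracy} using normalized exact match."""
    totals = defaultdict(lambda: [0, 0])  # key -> [correct, seen]
    for r in records:
        key = (r["task"], r["format"])
        totals[key][0] += normalize(r["prediction"]) == normalize(r["answer"])
        totals[key][1] += 1
    return {key: correct / seen for key, (correct, seen) in totals.items()}

# Hypothetical predictions with placeholder labels (not real dataset entries).
RECORDS = [
    {"task": "pathology", "format": "open-ended",
     "prediction": "Label_A.", "answer": "label_a"},
    {"task": "pathology", "format": "open-ended",
     "prediction": "label_b", "answer": "label_c"},
    {"task": "abnormality", "format": "closed-ended",
     "prediction": "Yes", "answer": "yes"},
]

RESULT = accuracy_by_group(RECORDS)
for key, acc in sorted(RESULT.items()):
    print(key, acc)
```

Reporting each (task, format) pair separately keeps a weak open-ended pathology score from being hidden by easier closed-ended questions in an overall average.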

Code & Models

Repositories

https://huggingface.co/datasets/raidium/RadImageNet-VQA