RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis
Xiaoman Zhang, Chaoyi Wu, Ziheng Zhao, Jiayu Lei, Ya Zhang, Yanfeng Wang, Weidi Xie

TL;DR
RadGenome-Chest CT is a large-scale, region-guided 3D chest CT dataset with detailed annotations, reports, and VQA pairs, designed to advance multimodal foundation models in medical imaging.
Contribution
The paper introduces RadGenome-Chest CT, a comprehensive dataset with organ segmentation, grounded reports, and VQA pairs, enhancing multimodal AI research in chest CT analysis.
Findings
Dataset includes 25,692 CT volumes and reports from 20,000 patients.
Contains 197 organ segmentation categories and 665K grounded reports.
Provides 1.3 million grounded VQA pairs for multimodal learning.
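As a rough illustration of how the three kinds of annotation listed above (segmentation masks, grounded reports, grounded VQA pairs) could hang together for a single CT volume, here is a minimal sketch. It is not the dataset's actual file format or API: the class names, field names, region label, and toy texts are all hypothetical, and masks are represented as plain voxel lists only to keep the example dependency-free.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]  # (z, y, x) index into a CT volume


@dataclass
class GroundedSentence:
    region: str  # hypothetical region label the sentence is tied to
    text: str    # report sentence describing that region


@dataclass
class GroundedVQA:
    region: str
    question: str
    answer: str


@dataclass
class CaseRecord:
    volume_id: str                  # hypothetical identifier
    masks: Dict[str, List[Voxel]]   # region label -> voxels inside its mask
    sentences: List[GroundedSentence] = field(default_factory=list)
    vqa: List[GroundedVQA] = field(default_factory=list)


def bounding_box(voxels: List[Voxel]) -> Tuple[Voxel, Voxel]:
    """Return (min corner, max corner) of a non-empty voxel list."""
    zs, ys, xs = zip(*voxels)
    return (min(zs), min(ys), min(xs)), (max(zs), max(ys), max(xs))


def region_view(record: CaseRecord, region: str) -> dict:
    """Collect everything grounded to one region: mask box, sentences, QA pairs."""
    return {
        "box": bounding_box(record.masks[region]),
        "sentences": [s.text for s in record.sentences if s.region == region],
        "vqa": [(q.question, q.answer) for q in record.vqa if q.region == region],
    }


# Toy data, invented for illustration only (not taken from the dataset).
record = CaseRecord(
    volume_id="example_case",
    masks={"left lung": [(0, 1, 2), (1, 3, 2), (2, 2, 5)]},
    sentences=[GroundedSentence("left lung", "Toy sentence about the left lung.")],
    vqa=[GroundedVQA("left lung", "Toy question?", "Toy answer.")],
)
view = region_view(record, "left lung")
print(view["box"])
```

Keying every sentence and question by a region label is what makes the annotations "grounded" in this sketch: a model or evaluator can look up the mask that a given piece of text refers to.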
Abstract
Developing generalist foundation models has recently attracted tremendous attention among researchers in the field of AI for Medicine (AI4Medicine). A pivotal insight in developing these models is their reliance on dataset scaling, which underscores the need for open-source medical image datasets that incorporate diverse supervision signals across various imaging modalities. In this paper, we introduce RadGenome-Chest CT, a comprehensive, large-scale, region-guided 3D chest CT interpretation dataset based on CT-RATE. Specifically, we leverage the latest powerful universal segmentation and large language models to extend the original dataset (25,692 non-contrast 3D chest CT volumes and reports from 20,000 patients) in the following aspects: (i) organ-level segmentation masks covering 197 categories, which provide intermediate reasoning visual clues for…
Taxonomy
Topics: Lung Cancer Diagnosis and Treatment · COVID-19 diagnosis using AI · Radiomics and Machine Learning in Medical Imaging
Methods: Sparse Evolutionary Training
