AI-Generated Annotations Dataset for Diverse Cancer Radiology Collections in NCI Image Data Commons
Gowtham Krishnan Murugesan, Diana McCrumb, Mariam Aboian, Tej Verma, Rahul Soni, Fatima Memon, Keyvan Farahani, Linmin Pei, Ulrike Wagner, Andrey Y. Fedorov, David Clunie, Stephen Moore, Jeff Van Oss

TL;DR
This paper presents a large dataset of AI-generated and radiologist-verified annotations for diverse cancer radiology collections in the NCI Image Data Commons, enhancing the availability of high-quality labeled medical images for research.
Contribution
It introduces a comprehensive, publicly accessible dataset of AI-generated and validated annotations across multiple imaging modalities and cancer types, improving annotation coverage in IDC collections.
Findings
AI-generated annotations cover 11 IDC collections spanning CT, MRI, and PET.
Radiologist review improves annotation accuracy and quality.
All data is standardized in DICOM format for easy integration.
Abstract
The National Cancer Institute (NCI) Image Data Commons (IDC) offers publicly available cancer radiology collections for cloud computing, which are crucial for developing advanced imaging tools and algorithms. Despite their potential, these collections are minimally annotated: only 4% of DICOM studies in the collections considered in this project had existing segmentation annotations. This project increases the quantity of segmentations in various IDC collections. We produced a high-quality dataset of AI-generated imaging annotations of tissues, organs, and/or cancers for 11 distinct IDC image collections. These collections contain images from a variety of modalities, including computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET). The collections cover various body parts, such as the chest, breast, kidneys, prostate, and liver. A portion of the AI annotations…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Artificial Intelligence in Healthcare and Education · Advanced X-ray and CT Imaging
