Open-radiomics: A Collection of Standardized Datasets and a Technical Protocol for Reproducible Radiomics Machine Learning Pipelines
Khashayar Namdar, Matthias W. Wagner, Birgit B. Ertl-Wagner, Farzad Khalvati

TL;DR
This paper introduces open-radiomics, a standardized collection of datasets and a protocol to enhance reproducibility in radiomics machine learning pipelines, demonstrating how variability sources affect model performance.
Contribution
It provides a comprehensive, standardized radiomics dataset collection and a technical protocol to improve reproducibility and understand variability sources in radiomics ML pipelines.
Findings
Tumor subregion and imaging sequence significantly impact model performance.
The highest AUROC achieved was 0.951, obtained with a specific combination of MRI sequence and tumor subregion.
Superficially perfect performance is often irreproducible because of these sources of variability.
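To make the last finding concrete, the sketch below shows how AUROC alone can shift when only the test split changes. It is a minimal illustration and not the authors' protocol: the scores are synthetic, the function names (`auroc`, `synthetic_cohort`, `auroc_across_splits`) are hypothetical, and the only detail borrowed from the paper is the class balance of 76 LGG versus 293 HGG cases. The Gaussian score distributions, the 20% test fraction, and the 20 split seeds are assumptions chosen for illustration.

```python
import random


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def synthetic_cohort(seed, n_lgg=76, n_hgg=293):
    """Synthetic classifier scores (assumption): HGG scores are shifted upward."""
    rng = random.Random(seed)
    labels = [0] * n_lgg + [1] * n_hgg
    scores = [rng.gauss(0.0, 1.0) for _ in range(n_lgg)]
    scores += [rng.gauss(1.5, 1.0) for _ in range(n_hgg)]
    return labels, scores


def auroc_across_splits(labels, scores, n_splits=20, test_fraction=0.2):
    """AUROC on stratified random test subsets, one subset per split seed."""
    lgg = [i for i, y in enumerate(labels) if y == 0]
    hgg = [i for i, y in enumerate(labels) if y == 1]
    results = []
    for split_seed in range(n_splits):
        rng = random.Random(split_seed)
        test = rng.sample(lgg, max(1, int(len(lgg) * test_fraction)))
        test += rng.sample(hgg, max(1, int(len(hgg) * test_fraction)))
        results.append(auroc([labels[i] for i in test],
                             [scores[i] for i in test]))
    return results


labels, scores = synthetic_cohort(seed=0)
per_split = auroc_across_splits(labels, scores)
print(f"min={min(per_split):.3f} max={max(per_split):.3f}")
```

The scores are held fixed across splits, so any spread between the minimum and maximum comes from the choice of test subset alone. Reporting a single split would hide that spread, which is one reason a single best-case number can be hard to reproduce.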
Abstract
Background: As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges, namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol to improve the reproducibility of the results. Methods: We curated large-scale radiomics datasets based on three open-source datasets: BraTS 2020 for high-grade glioma (HGG) versus low-grade glioma (LGG) classification and survival analysis, BraTS 2023 for O6-methylguanine-DNA methyltransferase classification, and non-small cell lung cancer survival analysis from The Cancer Imaging Archive. Using the BraTS 2020 magnetic resonance imaging (MRI) dataset, we applied our protocol to 369 brain tumor patients (76 LGG, 293 HGG). Leveraging PyRadiomics for LGG vs. HGG…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Glioma Diagnosis and Treatment · Colorectal Cancer Treatments and Studies
