A comprehensive and easy-to-use multi-domain multi-task medical imaging meta-dataset
Stefano Woerner, Arthur Jaques, Christian F. Baumgartner

TL;DR
This paper introduces MedIMeta, a comprehensive, standardized multi-domain, multi-task medical imaging dataset designed to facilitate machine learning research by addressing data scarcity and heterogeneity issues.
Contribution
The paper presents MedIMeta, a large, standardized, multi-domain, multi-task medical imaging dataset that simplifies data usage and promotes research in medical image analysis.
Findings
Demonstrated the dataset's utility through supervised learning baselines.
Validated cross-domain few-shot learning performance.
Showcased dataset standardization benefits.
Abstract
While the field of medical image analysis has undergone a transformative shift with the integration of machine learning techniques, the main challenge for these techniques is often the scarcity of large, diverse, and well-annotated datasets. Medical images vary in format, size, and other parameters and therefore require extensive preprocessing and standardization before they can be used in machine learning. Addressing these challenges, we introduce the Medical Imaging Meta-Dataset (MedIMeta), a novel multi-domain, multi-task meta-dataset. MedIMeta contains 19 medical imaging datasets spanning 10 different domains and encompassing 54 distinct medical tasks, all of which are standardized to the same format and readily usable in PyTorch or other ML frameworks. We perform a technical validation of MedIMeta, demonstrating its utility through fully supervised and cross-domain few-shot learning baselines.
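The few-shot baselines mentioned in the abstract rely on episodic evaluation, in which each episode draws a handful of labeled support examples and a disjoint set of query examples for a small number of classes. The sketch below illustrates one common way to sample such an N-way K-shot episode from a list of integer labels. It is not the authors' code and does not use the MedIMeta API; the function name `sample_episode`, its parameters, and the toy label list are all hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, seed=None):
    """Sample one N-way K-shot episode (hypothetical helper, not part of MedIMeta).

    Returns two lists of (sample_index, episode_label) pairs: support and query.
    Episode labels are re-indexed to 0..n_way-1.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough samples for both support and query are eligible.
    eligible = sorted(c for c, idxs in by_class.items()
                      if len(idxs) >= k_shot + n_query)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")

    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw support and query together so they never overlap.
        picked = rng.sample(by_class[cls], k_shot + n_query)
        support += [(i, episode_label) for i in picked[:k_shot]]
        query += [(i, episode_label) for i in picked[k_shot:]]
    return support, query


# Toy stand-in for a task's labels: 4 classes, 10 samples each (assumed data).
labels = [i % 4 for i in range(40)]
support, query = sample_episode(labels, n_way=3, k_shot=2, n_query=3, seed=0)
```

With a real dataset, the returned indices would be used to fetch the corresponding images; here they only index into the toy `labels` list.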
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging
