TL;DR
MedMNIST offers a standardized, lightweight collection of 10 diverse medical image datasets for classification tasks, facilitating benchmarking and development of AutoML algorithms in medical imaging.
Contribution
This paper introduces MedMNIST, a comprehensive, pre-processed dataset collection and benchmarking platform for AutoML in medical image analysis, emphasizing simplicity and diversity.
Findings
Several baseline methods, including open-source or commercial AutoML tools, are compared on the MedMNIST datasets.
MedMNIST datasets are publicly available for research and benchmarking.
The benchmark facilitates rapid prototyping and educational use in medical imaging.
Abstract
We present MedMNIST, a collection of 10 pre-processed open medical datasets. MedMNIST is standardized to perform classification tasks on lightweight 28x28 images, and using it requires no background knowledge. Covering the primary data modalities in medical image analysis, it is diverse in data scale (from 100 to 100,000) and tasks (binary/multi-class, ordinal regression and multi-label). MedMNIST could be used for educational purposes, rapid prototyping, multi-modal machine learning or AutoML in medical image analysis. Moreover, the MedMNIST Classification Decathlon is designed to benchmark AutoML algorithms on all 10 datasets; we have compared several baseline methods, including open-source or commercial AutoML tools. The datasets, evaluation code and baseline methods for MedMNIST are publicly available at https://medmnist.github.io/.
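To make the task types named in the abstract concrete, the following is a minimal sketch of how labels might be laid out for a multi-class versus a multi-label task on 28x28 images. It uses synthetic data only and does not use the MedMNIST package or its evaluation code. The helper names (`make_synthetic_image`, `multiclass_accuracy`, `multilabel_accuracy`) are hypothetical, and plain accuracy is an assumed metric chosen for illustration, not necessarily the one used in the benchmark.

```python
import random

IMAGE_SIZE = 28  # side length stated in the abstract


def make_synthetic_image(rng):
    """Return a 28x28 grid of random 8-bit intensities (placeholder data)."""
    return [[rng.randrange(256) for _ in range(IMAGE_SIZE)]
            for _ in range(IMAGE_SIZE)]


def multiclass_accuracy(y_true, y_pred):
    """Fraction of samples whose single predicted class matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multilabel_accuracy(y_true, y_pred):
    """Fraction of individual (sample, label) entries predicted correctly."""
    total = sum(len(row) for row in y_true)
    correct = sum(t == p
                  for t_row, p_row in zip(y_true, y_pred)
                  for t, p in zip(t_row, p_row))
    return correct / total


rng = random.Random(0)
image = make_synthetic_image(rng)
print(len(image), len(image[0]))  # 28 28

# Binary and multi-class tasks: one integer class index per image.
print(multiclass_accuracy([0, 1, 2, 2], [0, 1, 1, 2]))  # 0.75

# Multi-label task: one 0/1 flag per label per image.
print(multilabel_accuracy([[1, 0], [0, 1]], [[1, 1], [0, 1]]))  # 0.75
```

The point of the sketch is the difference in label shape: a multi-class sample carries one class index, while a multi-label sample carries a vector of independent flags, so the two need different scoring functions.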
