Rethinking model prototyping through the MedMNIST+ dataset collection
Sebastian Doerrich, Francesco Di Salvo, Julius Brockmann, Christian Ledig

TL;DR
This paper introduces a comprehensive benchmark for the MedMNIST+ dataset collection, systematically evaluating CNNs and ViTs to improve model development and prototyping in medical imaging.
Contribution
It presents a standardized evaluation framework for MedMNIST+ and reassesses model effectiveness, highlighting efficient training schemes and the limited impact of higher resolutions.
Findings
Efficient training schemes and foundation models are viable alternatives to costly training.
Higher image resolutions do not always improve performance significantly.
CNNs remain competitive with Vision Transformers in medical imaging tasks.
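To make the first finding concrete, the sketch below shows one common training-free evaluation scheme: k-nearest-neighbour classification on embeddings from a frozen, pretrained encoder. It is an illustration only, not the paper's code. The function name knn_predict, the choice of cosine similarity, k=5, and the synthetic Gaussian "embeddings" are all assumptions made for this example; in practice the features would come from a pretrained CNN or ViT applied to MedMNIST+ images.

```python
import numpy as np


def knn_predict(train_feats, train_labels, test_feats, k=5):
    """Majority vote among the k most cosine-similar training embeddings.

    Hypothetical helper for illustration; labels must be non-negative ints.
    """
    # L2-normalise rows so that a dot product equals cosine similarity.
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    sims = test @ train.T  # shape: (n_test, n_train)

    # Indices of the k most similar training samples for each test sample.
    nn_idx = np.argsort(-sims, axis=1)[:, :k]
    nn_labels = train_labels[nn_idx]  # shape: (n_test, k)

    # Count votes per class and pick the most frequent label.
    n_classes = int(train_labels.max()) + 1
    votes = np.stack([np.bincount(row, minlength=n_classes) for row in nn_labels])
    return votes.argmax(axis=1)


# Synthetic stand-in for frozen-encoder embeddings (assumption: two
# well-separated classes in a 4-dimensional feature space).
rng = np.random.default_rng(0)
centers = np.array([[5.0, 0.0, 0.0, 0.0], [0.0, 5.0, 0.0, 0.0]])


def make_split(n_per_class):
    feats = np.concatenate(
        [c + rng.normal(scale=0.5, size=(n_per_class, 4)) for c in centers]
    )
    labels = np.repeat(np.arange(len(centers)), n_per_class)
    return feats, labels


train_x, train_y = make_split(50)
test_x, test_y = make_split(20)

pred = knn_predict(train_x, train_y, test_x, k=5)
accuracy = float((pred == test_y).mean())
print(f"k-NN accuracy on synthetic embeddings: {accuracy:.2f}")
```

Because the encoder stays frozen, the only cost is one forward pass per image plus a similarity search, which is why schemes of this kind are attractive for quick prototyping.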
Abstract
The integration of deep learning based systems in clinical practice is often impeded by challenges rooted in limited and heterogeneous medical datasets. In addition, the field has increasingly prioritized marginal performance gains on a few, narrowly scoped benchmarks over clinical applicability, slowing down meaningful algorithmic progress. This trend often results in excessive fine-tuning of existing methods on selected datasets rather than fostering clinically relevant innovations. In response, this work introduces a comprehensive benchmark for the MedMNIST+ dataset collection, designed to diversify the evaluation landscape across several imaging modalities, anatomical regions, classification tasks and sample sizes. We systematically reassess commonly used Convolutional Neural Networks (CNNs) and Vision Transformer (ViT) architectures across distinct medical datasets, training…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Machine Learning in Healthcare
Methods: Attention Is All You Need · Linear Layer · Label Smoothing · Position-Wise Feed-Forward Layer · Layer Normalization · Byte Pair Encoding · Softmax · Multi-Head Attention · Dense Connections · Absolute Position Encodings
