A Comprehensive Study of Modern Architectures and Regularization Approaches on CheXpert5000
Sontje Ihler, Felix Kuhnke, Svenja Spindeldreier

TL;DR
This study evaluates modern neural network architectures and regularization techniques for medical image classification on the CheXpert dataset with only 5,000 labeled images, highlighting the effectiveness of pretraining and regularization in low-data regimes.
Contribution
It provides the first comprehensive comparison of architectures and regularization methods on limited medical imaging data, demonstrating the benefits of pretraining and regularization techniques.
Findings
Models pretrained on ImageNet21k achieve a higher AUC than the other models.
Larger models require fewer training steps.
Regularization improves calibration and accuracy.
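The findings above are stated in terms of AUC and calibration. The summary does not say how calibration was measured, so the following is only a minimal sketch of two commonly used metrics for a single binary finding: a pairwise (rank-based) AUC and a binned expected calibration error (ECE). The function names `binary_auc` and `expected_calibration_error`, and the choice of 10 bins, are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count pairwise wins; a tie counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def expected_calibration_error(
    labels: Sequence[int], probs: Sequence[float], n_bins: int = 10
) -> float:
    """Binary ECE: size-weighted mean of |observed positive rate - mean predicted probability| per bin."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[idx].append((y, p))
    total = len(labels)
    ece = 0.0
    for b in bins:
        if not b:
            continue  # empty bins contribute nothing
        pos_rate = sum(y for y, _ in b) / len(b)
        mean_prob = sum(p for _, p in b) / len(b)
        ece += len(b) / total * abs(pos_rate - mean_prob)
    return ece


if __name__ == "__main__":
    # Toy example for one hypothetical finding, not data from the study.
    y_true = [0, 0, 1, 1]
    y_prob = [0.1, 0.4, 0.35, 0.8]
    print(binary_auc(y_true, y_prob))
    print(expected_calibration_error(y_true, y_prob))
```

CheXpert is a multi-label dataset, so in practice a metric like this would typically be computed per finding and then averaged. A lower ECE means predicted probabilities track observed frequencies more closely, which is one way the calibration claim could be checked.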
Abstract
Computer-aided diagnosis (CAD) has gained increasing attention in the general research community in recent years as an example of a typical limited-data application, with experiments on labeled datasets of 100k-200k samples. Although these datasets are still small compared to natural image datasets like ImageNet1k, ImageNet21k and JFT, they are large for annotated medical datasets, where 1k-10k labeled samples are much more common. There is no baseline indicating which methods to build on in the low-data regime. In this work we bridge this gap by providing an extensive study on medical image classification with limited annotations (5k). We present a study of modern architectures applied to a fixed low-data regime of 5000 images on the CheXpert dataset. In conclusion, we find that models pretrained on ImageNet21k achieve a higher AUC and larger models require fewer training steps. All models…
Taxonomy
Topics: COVID-19 diagnosis using AI · AI in cancer detection · Machine Learning and Data Classification
Methods: Attention Is All You Need · Linear Layer · Dropout · Layer Normalization · Dense Connections · Multi-Head Attention · Absolute Position Encodings · Adam · Position-Wise Feed-Forward Layer · Vision Transformer
