Does deep learning model calibration improve performance in class-imbalanced medical image classification?
Sivaramakrishnan Rajaraman, Prasanth Ganesan, Sameer Antani

TL;DR
This study systematically analyzes how model calibration affects deep learning performance in class-imbalanced medical image classification, showing that calibration improves results at the default threshold of 0.5 but not at the optimal, PR-guided threshold.
Contribution
It provides a comprehensive analysis of calibration effects across different medical imaging modalities, imbalance levels, and thresholds, clarifying when calibration is beneficial.
Findings
Calibration improves performance at the default classification threshold of 0.5.
Calibration yields no significant benefit at the precision-recall (PR)-guided threshold.
These effects are consistent across the chest X-ray and fundus image datasets.
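The difference between the two thresholds in the findings above can be illustrated with a minimal sketch. It assumes that "PR-guided" means choosing the threshold with the highest F1 score along the precision-recall curve, which is one common convention and may not match the paper's exact procedure. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """F1 for the positive (abnormal) class, predicting 1 when prob >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pr_guided_threshold(y_true, y_prob):
    """Assumed definition: the distinct score that maximizes F1 (hypothetical helper)."""
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, y_prob, t))


# Made-up scores: 8 normal (0) and 2 abnormal (1) samples, with predicted
# probabilities squeezed toward the majority class.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.30, 0.35, 0.45]

default_f1 = f1_at_threshold(y_true, y_prob, 0.5)   # every sample predicted normal
best_t = pr_guided_threshold(y_true, y_prob)
best_f1 = f1_at_threshold(y_true, y_prob, best_t)

print(f"F1 at default threshold 0.5: {default_f1:.2f}")
print(f"PR-guided threshold: {best_t:.2f}, F1: {best_f1:.2f}")
```

In this toy case the abnormal samples are ranked correctly but never score above 0.5, so the default threshold misses them while a threshold chosen from the PR curve does not. That is the kind of gap the default-versus-optimal comparison is meant to expose.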
Abstract
In medical image classification tasks, it is common to find that the number of normal samples far exceeds the number of abnormal samples. In such class-imbalanced situations, reliable training of deep neural networks continues to be a major challenge. Under these circumstances, the predicted class probabilities may be biased toward the majority class. Calibration has been suggested to alleviate some of these effects. However, there is insufficient analysis explaining when and whether calibrating a model would be beneficial in improving performance. In this study, we perform a systematic analysis of the effect of model calibration on its performance on two medical image modalities, namely, chest X-rays and fundus images, using various deep learning classifier backbones. For this, we study the following variations: (i) the degree of imbalances in the dataset used for training; (ii)…
Taxonomy
Topics: COVID-19 diagnosis using AI · Imbalanced Data Classification Techniques · Digital Imaging for Blood Diseases
