Are Deep Learning Classification Results Obtained on CT Scans Fair and Interpretable?

Mohamad M.A. Ashames; Ahmet Demir; Omer N. Gerek; Mehmet Fidan; M. Bilginer Gulmezoglu; Semih Ergin; Mehmet Koc; Atalay Barkana; Cuneyt Calisir

arXiv:2309.12632·cs.LG·November 16, 2023

Open Access

TL;DR

This paper highlights the importance of patient-wise data separation when training deep learning models for CT scan classification. It shows that keeping each patient's images within a single set (training, validation, or test) improves fairness, interpretability, and performance on new patients.

Contribution

The paper emphasizes the need for strict patient-level separation of training, validation, and test data to ensure fair, interpretable, and accurate deep learning classification results on CT scans.
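
As a rough illustration of what patient-level separation means in practice, the sketch below splits slice indices by patient rather than by slice. It is not the authors' code: the function name `patient_wise_split`, the `slice_patient_ids` layout (one patient ID per slice), and the default `test_fraction` are all assumptions made for this example.

```python
import random
from collections import defaultdict


def patient_wise_split(slice_patient_ids, test_fraction=0.2, seed=0):
    """Split slice indices so that no patient appears on both sides.

    slice_patient_ids: list where entry i is the patient ID of slice i
    (a hypothetical layout chosen for this sketch).
    Returns (train_indices, test_indices, test_patient_ids).
    """
    # Group slice indices by the patient they belong to.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(slice_patient_ids):
        by_patient[pid].append(idx)

    # Shuffle patients, not individual slices.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Hold out a fraction of the patients (at least one).
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train_idx = [i for p in patients[n_test:] for i in by_patient[p]]
    test_idx = [i for p in patients[:n_test] for i in by_patient[p]]
    return train_idx, test_idx, test_patients


# Toy usage: five hypothetical patients with different slice counts.
ids = ["p1"] * 3 + ["p2"] * 2 + ["p3"] * 4 + ["p4"] * 1 + ["p5"] * 2
train_idx, test_idx, test_patients = patient_wise_split(ids, test_fraction=0.4)
```

Because whole patients are held out, the test fraction is measured in patients, not slices, so the slice-level ratio can drift when patients contribute very different numbers of images. A validation set could be carved out the same way by applying the function again to the training patients.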

Findings

01. Models trained with patient-wise separation perform better on data from new patients.

02. With patient-wise separation, heat-map visualizations focus more closely on the nodules.

03. Random shuffling of images across training, validation, and test sets can yield misleading accuracy and cause models to learn irrelevant features.

Abstract

Following the great success of various deep learning methods in image and object classification, the biomedical image processing community has also been flooded with their applications to various automatic diagnosis cases. Unfortunately, most deep learning-based classification attempts in the literature focus solely on achieving extreme accuracy scores, without considering interpretability or patient-wise separation of training and test data. For example, most lung nodule classification papers using deep learning randomly shuffle the data and split it into training, validation, and test sets, so that some images from a person's CT scan land in the training set while other images of the same person land in the validation or test sets. This can result in reporting misleading accuracy rates and the learning of irrelevant features, ultimately reducing the…
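
The leakage described in the abstract can be made concrete with a small check. The sketch below is only an illustration on a made-up cohort (10 hypothetical patients with 30 slices each); the helper names `slice_level_split` and `shared_patients` are invented for this example and do not come from the paper.

```python
import random


def slice_level_split(n_slices, test_fraction=0.2, seed=0):
    """Random shuffle over individual slices, ignoring patient identity."""
    indices = list(range(n_slices))
    random.Random(seed).shuffle(indices)
    n_test = round(n_slices * test_fraction)
    return indices[n_test:], indices[:n_test]


def shared_patients(slice_patient_ids, train_idx, test_idx):
    """Return the patient IDs that occur in both index lists."""
    train_patients = {slice_patient_ids[i] for i in train_idx}
    test_patients = {slice_patient_ids[i] for i in test_idx}
    return train_patients & test_patients


# Toy cohort: 10 hypothetical patients, 30 slices each.
ids = [f"patient_{p}" for p in range(10) for _ in range(30)]
train_idx, test_idx = slice_level_split(len(ids))
leaked = shared_patients(ids, train_idx, test_idx)
print(f"{len(leaked)} of 10 patients appear in both training and test sets")
```

Running the same check on any real split is a cheap sanity test: an empty result means no patient contributes images to both sets, while a non-empty result signals the overlap that the abstract warns can inflate reported accuracy.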

Taxonomy

Topics: Radiomics and Machine Learning in Medical Imaging · Lung Cancer Diagnosis and Treatment · Advanced X-ray and CT Imaging

Methods: Focus