Evaluating Generalizability of Deep Learning Models Using Indian-COVID-19 CT Dataset
Suba S, Nita Parekh, Ramesh Loganathan, Vikram Pudi, Chinnababu Sunkavalli

TL;DR
This study evaluates the generalizability of different machine learning models, including deep learning CNNs, on COVID-19 CT scan datasets from India and compares their performance on internal and external data.
Contribution
It introduces the Indian-COVID-19 CT dataset and compares the generalizability of multiple ML models, highlighting the superior external performance of a lightweight CNN.
Findings
Deep learning models perform well on internal datasets (90-99% accuracy).
All models show a performance drop on the external Indian dataset (8-19%).
A lightweight CNN outperforms the deep models on external data with 88% accuracy.
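The internal-versus-external comparison behind these findings can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the helper names `accuracy` and `generalization_drop` are hypothetical, the labels are toy values rather than data from the paper, and the drop is assumed to be measured in percentage points, which this summary does not state.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def generalization_drop(internal_acc, external_acc):
    """Accuracy lost on external data, in percentage points (assumed unit)."""
    return (internal_acc - external_acc) * 100.0


# Toy integer class labels (hypothetical, not taken from either dataset).
internal_true = [0, 1, 2, 1, 0, 2, 1, 0, 2, 1]
internal_pred = [0, 1, 2, 1, 0, 2, 1, 0, 2, 0]  # one error
external_true = [0, 1, 2, 1, 0, 2, 1, 0, 2, 1]
external_pred = [0, 1, 2, 0, 0, 2, 1, 1, 2, 1]  # two errors

internal_acc = accuracy(internal_true, internal_pred)
external_acc = accuracy(external_true, external_pred)
drop = generalization_drop(internal_acc, external_acc)

print(f"internal accuracy: {internal_acc:.2f}")
print(f"external accuracy: {external_acc:.2f}")
print(f"drop: {drop:.1f} percentage points")
```

Reporting the drop alongside the external accuracy matters because a model can have the highest internal score and still lose the most when moved to unseen data.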
Abstract
Computed tomography (CT) has been routinely used for the diagnosis of lung diseases and recently, during the pandemic, for detecting the infectivity and severity of COVID-19 disease. One of the major concerns in using machine learning (ML) approaches for automatic processing of CT scan images in clinical settings is that these methods are trained on limited and biased subsets of publicly available COVID-19 data. This has raised concerns regarding the generalizability of these models to external datasets not seen by the model during training. To address some of these issues, in this work CT scan images from confirmed COVID-19 data obtained from one of the largest public repositories, COVIDx CT 2A, were used for training and internal validation of machine learning models. For the external validation we generated the Indian-COVID-19 CT dataset, an open-source repository containing 3D CT…
Taxonomy
Topics: COVID-19 diagnosis using AI · AI in cancer detection · Radiomics and Machine Learning in Medical Imaging
Methods: Test · 1x1 Convolution · Average Pooling · Label Smoothing · Max Pooling · Dropout · Convolution · Softmax · Inception-v3 Module · Dense Connections
