# Performance of AI Approaches for COVID-19 Diagnosis Using Chest CT Scans: The Impact of Architecture and Dataset

**Authors:** Astha Jaiswal, Philipp Fervers, Fanyang Meng, Huimao Zhang, Dorottya Móré, Athanasios Giannakis, Jasmin Wailzer, Andreas Michael Bucher, David Maintz, Jonathan Kottlors, Rahil Shahzad, Thorsten Persigehl

PMC · DOI: 10.1055/a-2577-3928 · 2025-04-29

## TL;DR

This study compares different AI models for diagnosing COVID-19 using chest CT scans and finds that the training data has a bigger impact on performance than the model architecture.

## Contribution

The study evaluates three AI architectures on a diverse, multicenter dataset to determine the impact of data and model design on diagnostic accuracy.

## Key findings

- AI models showed high specificity but moderate sensitivity in diagnosing COVID-19 from CT scans.
- The training data had a greater impact on model performance than the model architecture.
- AI models should be used as a tool to assist radiologists, not as standalone diagnostic tools.

## Abstract

AI is emerging as a promising tool for diagnosing COVID-19 based on chest CT scans. This study aimed to compare AI models for COVID-19 diagnosis. To this end, we (1) trained three distinct AI models to distinguish COVID-19 from non-COVID-19 pneumonia (nCP) using a large, clinically relevant CT dataset, (2) evaluated the models’ performance on an independent test set, and (3) compared the models both algorithmically and experimentally.

In this multicenter, multi-vendor study, we collected n=1591 chest CT scans of COVID-19 (n=762) and nCP (n=829) patients from China and Germany. In Germany, the data were collected from three RACOON sites. We trained and validated three COVID-19 AI models with different architectures: COVNet, based on a 2D-CNN; DeCoVnet, based on a 3D-CNN; and AD3D-MIL, based on a 3D-CNN with an attention module. A total of 991 CT scans were used to train the AI models with 5-fold cross-validation, and the remaining 600 CT scans, from 6 different centers, were used for independent testing. The models’ performance was evaluated using accuracy (Acc), sensitivity (Se), and specificity (Sp).
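The three evaluation metrics can be illustrated with a minimal sketch. It assumes a binary encoding in which COVID-19 is the positive class (1) and nCP the negative class (0); that encoding, the function name `diagnostic_metrics`, and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, Sequence


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Assumed encoding (not from the paper): 1 = COVID-19, 0 = nCP.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return {
        "accuracy": (tp + tn) / len(y_true),   # share of all scans classified correctly
        "sensitivity": tp / (tp + fn),         # share of COVID-19 scans detected
        "specificity": tn / (tn + fp),         # share of nCP scans correctly rejected
    }


# Toy labels for illustration only; these are not study data.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 0, 1, 0, 0, 1, 0, 1]
toy_metrics = diagnostic_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Under this encoding, a model with high specificity and moderate sensitivity rarely labels nCP as COVID-19 but misses a larger share of true COVID-19 cases.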

The average validation accuracy of the COVNet, DeCoVnet, and AD3D-MIL models over the 5 folds was 80.9%, 82.0%, and 84.3%, respectively. On the independent test set with n=600 CT scans, COVNet yielded Acc=76.6%, Se=67.8%, Sp=85.7%; DeCoVnet provided Acc=75.1%, Se=61.2%, Sp=89.7%; and AD3D-MIL achieved Acc=73.9%, Se=57.7%, Sp=90.8%.
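One way to view the trade-off between sensitivity and specificity in a single number is balanced accuracy, the mean of Se and Sp. The paper does not report this quantity; the sketch below only derives it arithmetically from the test-set values quoted above, and the names `reported` and `balanced_accuracy` are illustrative.

```python
# Test-set sensitivity and specificity (in percent) as quoted in the abstract.
reported = {
    "COVNet": (67.8, 85.7),
    "DeCoVnet": (61.2, 89.7),
    "AD3D-MIL": (57.7, 90.8),
}


def balanced_accuracy(se: float, sp: float) -> float:
    """Mean of sensitivity and specificity (derived here, not reported in the paper)."""
    return (se + sp) / 2


balanced = {name: round(balanced_accuracy(se, sp), 2) for name, (se, sp) in reported.items()}

for name, value in balanced.items():
    print(f"{name}: {value:.2f}%")
```

The derived values fall within a few percentage points of each other, in line with the stated finding that architecture played a smaller role than the training data.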

The classification performance of the evaluated AI models depends more on the training data than on the architecture itself. On the independent test set, all three models showed high specificity but only moderate sensitivity. The AI classification models should not be used unsupervised but could potentially assist radiologists in identifying COVID-19 and nCP.

This study compares AI approaches for diagnosing COVID-19 in chest CT scans, which is essential for further optimizing the delivery of healthcare and for pandemic preparedness.

Our experiments using a multicenter, multi-vendor, diverse dataset show that the training data is the key factor in determining the diagnostic performance.

The AI models should not be used unsupervised but as a tool to assist radiologists.

Jaiswal A, Fervers P, Meng F et al. Performance of AI Approaches for COVID-19 Diagnosis Using Chest CT Scans: The Impact of Architecture and Dataset. Rofo 2026; 198: 185–198

## Linked entities

- **Diseases:** COVID-19 (MONDO:0100096), pneumonia (MONDO:0005249)

## Full-text entities

- **Diseases:** COVID-19 (MESH:D000086382)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12851823/full.md

---
Source: https://tomesphere.com/paper/PMC12851823