# Impact of CT dose on AI performance: A comparison of radiomics, deep, and foundation models in a multicentric anthropomorphic phantom study

**Authors:** María Martín Asiain, Mohammadreza Amirian, Oscar Jimenez del Toro, Christoph Aberle, Roger Schaer, Michael Bach, Markus Obmann, Kyriakos Flouris, Henning Müller, Bram Stieltjes, Ender Konukoglu, Vincent Andrearczyk, Adrien Depeursinge

PMC · DOI: 10.1002/mp.70374 · Medical Physics · 2026-03-18

## TL;DR

This study compares radiomics, a shallow CNN, SwinUNETR, and the foundation model CT-FM on CT images acquired at varying radiation doses, and finds that CT-FM generalizes across dose levels better than radiomics and the shallow CNN.

## Contribution

The study introduces a novel comparison of radiomics, deep learning, and foundation models in handling CT dose variations using both phantom and real patient data.

## Key findings

- CT-FM and SwinUNETR showed the highest stability to CT dose variations compared to radiomics and shallow CNNs.
- CT-FM achieved the highest dose-classification accuracy (0.6517 ± 0.0179) on the phantom data and the highest organ-classification accuracy (0.965) on the CT-ORG patient data.
- Radiomic features had limited robustness to dose changes, with the lowest ICC among the four methods (0.8355 ± 0.1705) and reduced liver tissue classification performance.

## Abstract

Computed tomography (CT) is widely used in clinical practice due to its ability to provide detailed anatomical information. However, variations in radiation dose can affect image quality, potentially compromising the performance and reliability of artificial intelligence (AI) models applied to these images.

This study evaluates the robustness of radiomics‐based and deep learning‐based models to variations in CT dose levels, using a standardized dataset obtained from a 3D‐printed anthropomorphic phantom simulating liver tissue with anomalies, as well as the publicly available CT‐ORG dataset of real patient data for organ classification. The work is at an early experimental stage and was tested only on retrospective data.

A total of 1378 image series from 649 scans were acquired across 13 scanners from four manufacturers at five dose levels. Features were extracted from six regions of interest (ROIs), representing four liver tissue types (normal, cyst, hemangioma, metastasis), using four methods: PyRadiomics, a shallow convolutional neural network (CNN), SwinUNETR, and a CT foundation model (CT‐FM). Feature stability was assessed using the Intraclass Correlation Coefficient (ICC), while Uniform Manifold Approximation and Projection (UMAP) was employed to evaluate tissue‐type separability and the influence of scanner variations. Generalizability was tested by training liver tissue classifiers on one dose level and testing on the others, alongside a dose classification task (10‐fold cross‐validation) to determine the sensitivity of each method to dose variations. In addition, we compared the four methods on the task of organ classification (10‐fold cross‐validation) with the CT‐ORG dataset, which contains 140 CT scans acquired with varying dose levels.
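As an illustration of the feature-stability measure, the sketch below computes an ICC for a single feature across dose levels. The abstract does not state which ICC form the authors used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption, as are the function name `icc_2_1` and the feature values. Rows are ROIs and columns are dose levels; averaging such per-feature ICCs over a method's feature set is one plausible way to arrive at a mean ± standard deviation summary.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of rows, one row per target (here: ROI),
             one column per condition (here: dose level).
    """
    n = len(ratings)        # number of targets (ROIs)
    k = len(ratings[0])     # number of conditions (dose levels)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-target mean square
    msc = ss_cols / (k - 1)                # between-condition mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical values of one feature: 6 ROIs (rows) x 5 dose levels (columns).
feature = [
    [10.1, 10.3,  9.9, 10.2, 10.0],
    [14.8, 15.1, 15.0, 14.7, 15.3],
    [12.0, 12.4, 11.8, 12.1, 12.3],
    [20.2, 19.8, 20.5, 20.1, 19.9],
    [17.5, 17.9, 17.2, 17.6, 17.4],
    [ 8.9,  9.2,  9.0,  8.8,  9.1],
]
print(round(icc_2_1(feature), 4))
```

A value near 1 means the feature ranks and scales the ROIs almost identically at every dose level; a systematic shift between dose levels lowers ICC(2,1) because it measures absolute agreement rather than consistency.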

Radiomic features showed limited robustness to dose variations, leading to reduced performance in liver tissue classification and the lowest ICC among methods (ICC: 0.8355 ± 0.1705). SwinUNETR and CT‐FM exhibited the highest stability (SwinUNETR ICC: 0.9528 ± 0.0272; CT‐FM ICC: 0.9347 ± 0.0420), clearly above the shallow CNN (ICC: 0.8416 ± 0.2018). CT‐FM also showed strong generalization across dose levels: its features effectively distinguished between liver tissue types and dose levels simultaneously, without compromising performance in either task. Consistent with these trends in dose sensitivity, CT‐FM obtained the highest dose‐classification accuracy (0.6517 ± 0.0179), whereas SwinUNETR showed the lowest (0.3796 ± 0.0250). These trends were confirmed in the context of organ classification with real patient data on the CT‐ORG dataset, where CT‐FM achieved the highest accuracy (0.965).
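The cross-dose generalization result rests on a train-on-one-dose, test-on-the-others protocol. The sketch below shows only the shape of that protocol. The abstract does not name the classifier, so a nearest-centroid classifier stands in for it; the dose labels (`dose_1` to `dose_5`), the two-dimensional toy features, and the assumption that noise grows from `dose_1` to `dose_5` are all hypothetical.

```python
import random


def centroid_fit(X, y):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def centroid_predict(centroids, xi):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(xi, centroids[c]))
    return min(centroids, key=dist2)


def cross_dose_accuracy(data, train_dose):
    """data maps dose label -> (X, y). Train on one dose, score every other dose."""
    X_tr, y_tr = data[train_dose]
    centroids = centroid_fit(X_tr, y_tr)
    results = {}
    for dose, (X_te, y_te) in data.items():
        if dose == train_dose:
            continue
        hits = sum(centroid_predict(centroids, x) == t for x, t in zip(X_te, y_te))
        results[dose] = hits / len(y_te)
    return results


# Synthetic stand-in data: four tissue types, toy 2-D features (hypothetical).
rng = random.Random(0)
TISSUES = ["normal", "cyst", "hemangioma", "metastasis"]
CENTERS = {t: [float(i), float(-i)] for i, t in enumerate(TISSUES)}


def make_dose(noise, n_per_class=20):
    X, y = [], []
    for t in TISSUES:
        for _ in range(n_per_class):
            X.append([c + rng.gauss(0.0, noise) for c in CENTERS[t]])
            y.append(t)
    return X, y


# Assumption for illustration only: noise increases from dose_1 to dose_5.
data = {f"dose_{i}": make_dose(noise=0.1 * i) for i in range(1, 6)}
for dose, acc in cross_dose_accuracy(data, "dose_1").items():
    print(dose, round(acc, 3))
```

Repeating the loop with each dose level as the training set gives a train-dose by test-dose accuracy matrix, which makes it easy to see whether a feature extractor degrades as the gap between training and test dose widens.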

The study highlights the limited robustness of traditional radiomics and deep models to CT dose variation and underscores the potential of foundation models like CT‐FM to enable robust clinical applications by mitigating dose‐related variability. This enhanced performance is likely due to the model's pretraining on large and diverse datasets, allowing it to learn robust and generalizable representations across varying acquisition conditions.

## Full-text entities

- **Diseases:** colon carcinoma (MESH:D003110), UMAP (MESH:C567162), Cancer (MESH:D009369), hemangioma (MESH:D006391), colon cancer (MESH:D015179), pulmonary nodules (MESH:D055613), hepatocellular carcinoma (MESH:D006528), hepatic metastases (MESH:D009362), cyst (MESH:D003560), DL (MESH:D007859), Lung Cancer (MESH:D008175), CT (MESH:C000719218)
- **Chemicals:** H&E (MESH:D006371), iodine (MESH:D007455), CT-FM (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12997016/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12997016/full.md

## References

17 references — full list in the complete paper: https://tomesphere.com/paper/PMC12997016/full.md

---
Source: https://tomesphere.com/paper/PMC12997016