# A new hybrid model for enhancing low-dose CT images using EfficientNetV2 and WGAN-GP: a multi-loss approach

**Authors:** Mohammad Hojjat, Mohammad Javad Shayegan

PMC · DOI: 10.1186/s40001-025-03579-z · European Journal of Medical Research · 2025-12-09

## TL;DR

This paper introduces a deep learning method that denoises low-dose CT images while preserving the anatomical detail needed for diagnosis.

## Contribution

A hybrid LDCT denoising model that pairs an EfficientNetV2-M multi-scale feature extractor with a WGAN-GP, trained with a weighted combination of adversarial, pixel-wise L1, and perceptual losses.

## Key findings

- The method achieved a PSNR of 33.24 dB and SSIM of 0.92 on the AAPM-Mayo Dataset.
- It improved on the unprocessed LDCT images by 4.0 dB in PSNR and 0.04 in SSIM.
- The model processed images at 12.5 FPS (0.08 s per 512 × 512 image on an NVIDIA Tesla T4 GPU), which the authors describe as meeting real-time clinical requirements.
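
The throughput and improvement figures above are linked by simple arithmetic. The sketch below uses only the numbers reported in this summary to convert frames per second into per-image latency and to back out the baseline LDCT scores implied by the stated gains. The variable names are illustrative, and the derived baseline values are an inference from the reported differences, not figures quoted from the paper.

```python
# Values reported in this summary.
reported_psnr_db = 33.24
reported_ssim = 0.92
psnr_gain_db = 4.0
ssim_gain = 0.04
frames_per_second = 12.5

# Per-image latency is the reciprocal of throughput.
latency_s = 1.0 / frames_per_second

# Baseline scores implied by subtracting the reported gains (an inference,
# not a number stated in the summary).
implied_baseline_psnr_db = reported_psnr_db - psnr_gain_db
implied_baseline_ssim = reported_ssim - ssim_gain

print(f"latency per image: {latency_s:.2f} s")
print(f"implied baseline PSNR: {implied_baseline_psnr_db:.2f} dB")
print(f"implied baseline SSIM: {implied_baseline_ssim:.2f}")
```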

## Abstract

Low-dose computed tomography (LDCT) is widely used for medical imaging due to its reduced radiation exposure. However, LDCT images often suffer from significant noise, which can compromise diagnostic accuracy. This study aims to develop an effective denoising method that preserves critical anatomical structures while reducing noise, using a deep learning approach.

We propose a novel LDCT image denoising method that integrates EfficientNetV2-M as a multi-scale feature extractor with a Wasserstein generative adversarial network with gradient penalty (WGAN-GP). The EfficientNetV2-M backbone (54.1 M parameters, depth scaling 1.2) employs seven stages of MBConv blocks with expansion ratios from 1 to 6, extracting hierarchical features at stages 3, 5, and 7. The model is optimized using three weighted loss functions: adversarial loss (Wasserstein distance), pixel-wise L1 loss (λ₂ = 1.0), and perceptual loss (λ₃ = 0.1). The discriminator employs a gradient penalty with coefficient λ = 10 for training stability. Training used 64 × 64 patches with a batch size of 128 and the Adam optimizer (learning rate 1e-5) on the AAPM-Mayo Dataset. Image quality was assessed using peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM).
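
The multi-loss objective described above can be written out in the standard WGAN-GP form. This is a sketch under stated assumptions, not the authors' exact formulation: the symbols $G$ (generator), $D$ (discriminator, or critic), $x$ (LDCT input), $y$ (normal-dose target), and $\phi$ (the feature network used for the perceptual loss) are notation introduced here, the adversarial weight $\lambda_1$ is not given in the abstract, and the norm used in the perceptual term is assumed.

```latex
% Generator objective: weighted sum of the three losses
% (lambda_2 = 1.0 and lambda_3 = 0.1 per the abstract; lambda_1 is not stated there).
\mathcal{L}_G = \lambda_1 \mathcal{L}_{\mathrm{adv}}
              + \lambda_2 \, \mathbb{E}\big[\lVert G(x) - y \rVert_1\big]
              + \lambda_3 \, \mathbb{E}\big[\lVert \phi(G(x)) - \phi(y) \rVert_2^2\big],
\qquad
\mathcal{L}_{\mathrm{adv}} = -\,\mathbb{E}\big[D(G(x))\big]

% Critic objective with gradient penalty (lambda = 10 per the abstract).
\mathcal{L}_D = \mathbb{E}\big[D(G(x))\big] - \mathbb{E}\big[D(y)\big]
              + \lambda \, \mathbb{E}_{\hat{x}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big],
\qquad
\hat{x} = \epsilon\, y + (1 - \epsilon)\, G(x), \quad \epsilon \sim \mathcal{U}[0, 1]
```

The gradient penalty pushes the critic's gradient norm toward 1 at points interpolated between real and generated images, which is what enforces the Lipschitz constraint in WGAN-GP and accounts for the training stability mentioned above.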

The proposed method achieved a PSNR of 33.24 ± 0.15 dB and an SSIM of 0.92 ± 0.005 on the AAPM-Mayo Dataset across 10 independent runs, representing improvements of 4.0 dB and 0.04 over baseline LDCT images. Inference speed reached 12.5 FPS (0.08 s per 512 × 512 image) on an NVIDIA Tesla T4 GPU, meeting real-time clinical requirements.
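
To make the PSNR figures concrete, here is a minimal standard-library sketch of the usual PSNR definition, $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The function names, the flat pixel lists, and the `data_range` of 1.0 are assumptions for illustration; the summary does not state the intensity range or windowing used in the paper's evaluation.

```python
import math


def psnr(reference, denoised, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(denoised) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, denoised)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain at a fixed data range."""
    return 10.0 ** (gain_db / 10.0)


# Toy three-pixel example (hypothetical values, not CT data).
reference = [0.0, 0.5, 1.0]
denoised = [0.0, 0.5, 0.9]
example_psnr = psnr(reference, denoised)
ratio_4db = mse_ratio_for_gain(4.0)

print(f"toy PSNR: {example_psnr:.2f} dB")
print(f"MSE reduction for a 4.0 dB gain: {ratio_4db:.2f}x")
```

Because PSNR is logarithmic in the mean squared error, the reported 4.0 dB improvement corresponds to roughly a 2.5-fold reduction in MSE, provided both scores are computed over the same intensity range.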

Our EfficientNetV2-WGAN-GP-based method provides a robust solution for LDCT image denoising, significantly improving image clarity while maintaining diagnostic structures. This approach holds potential for enhancing diagnostic accuracy and improving patient safety in clinical practice.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12802201/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12802201/full.md

## References

5 references — full list in the complete paper: https://tomesphere.com/paper/PMC12802201/full.md

---
Source: https://tomesphere.com/paper/PMC12802201