# Interpretable hybrid ensemble with attention-based fusion and EAOO-GA optimization for lung cancer detection

**Authors:** Mesfer Al Duhayyim, Murdhy A. Aldawsari, Atef Ismail, Marwa M. Emam

PMC · DOI: 10.1038/s41598-026-37187-6 · 2026-03-03

## TL;DR

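This paper presents SE-FusionEAOO Ensemble, a lung cancer classification framework for CT scans. It fuses features from paired pre-trained networks equipped with Squeeze-and-Excitation blocks, then weights the resulting models' predictions with a new metaheuristic, EAOO-GA. The authors report 99.40% accuracy on IQ-OTH/NCCD and 97.9% on LIDC-IDRI.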

## Contribution

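The paper contributes a hybrid ensemble of three SE-augmented feature fusion models and the EAOO-GA metaheuristic (Enhanced Animated Oat Optimization algorithm with Genetic Operators), which optimizes the ensemble weights. The authors report that the combination outperforms individual models, conventional ensemble methods, and other metaheuristic optimizers.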
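
To make the idea of optimized ensemble weights concrete, the following is a minimal sketch of weighted probability averaging with a weight search. This summary does not describe the internals of EAOO-GA, so the search below is a plain elitist genetic loop used as a stand-in, not the authors' algorithm. The function names, population size, mutation scale, and the use of accuracy as the fitness function are all assumptions made for illustration.

```python
import random


def ensemble_predict(probs_per_model, weights):
    """Fuse per-model class probabilities by weighted sum; return argmax class per sample."""
    n_samples = len(probs_per_model[0])
    n_classes = len(probs_per_model[0][0])
    preds = []
    for i in range(n_samples):
        fused = [
            sum(w * model[i][c] for w, model in zip(weights, probs_per_model))
            for c in range(n_classes)
        ]
        preds.append(max(range(n_classes), key=fused.__getitem__))
    return preds


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def normalize(weights):
    """Rescale weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def search_weights(probs_per_model, labels, pop_size=20, generations=30, seed=0):
    """Stand-in genetic search over ensemble weights (NOT the paper's EAOO-GA)."""
    rng = random.Random(seed)
    n_models = len(probs_per_model)

    def fitness(weights):
        return accuracy(ensemble_predict(probs_per_model, weights), labels)

    # Start from uniform weights plus random candidates, so the result
    # can never score below a plain average of the models.
    population = [[1.0 / n_models] * n_models]
    while len(population) < pop_size:
        population.append(normalize([rng.random() + 1e-9 for _ in range(n_models)]))

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitist selection: keep the top half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            alpha = rng.random()
            # Arithmetic crossover followed by Gaussian mutation.
            child = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
            child = [max(1e-9, x + rng.gauss(0, 0.05)) for x in child]
            children.append(normalize(child))
        population = parents + children

    best = max(population, key=fitness)
    return best, fitness(best)
```

In practice the fitness would be evaluated on held-out validation predictions from the three fusion models, so that the weights are not tuned on the test set.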

## Key findings

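- The framework achieves 99.40% accuracy, 99.2% precision, 99.5% recall, and a 99.3% F1-score on the IQ-OTH/NCCD dataset.
- External validation on the LIDC-IDRI dataset yields 97.9% accuracy and a 97.8% F1-score, which the authors take as evidence of generalization across independent clinical domains.
- SMOTE is used to address class imbalance in IQ-OTH/NCCD, and the authors report improved sensitivity to minority classes.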
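
As a rough illustration of the oversampling step, the sketch below implements the core of the standard SMOTE procedure: each synthetic point is an interpolation between a minority sample and one of its k nearest minority neighbors. It assumes the samples are numeric feature vectors. Whether the authors apply SMOTE to raw pixels or to extracted features is not stated in this summary, and the function name and parameters here are hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority-class neighbors.

    minority: list of equal-length numeric feature vectors (at least two).
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen sample (Euclidean distance).
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic
```

Synthetic samples should be generated from the training split only, so that no interpolated copy of a validation or test case leaks into training.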

## Abstract

Lung cancer’s high mortality rate underscores the critical need for early and accurate diagnosis, as late-stage diagnoses often lead to 5-year survival rates as low as 5% compared to 56% for early detection, imposing significant economic burdens on healthcare systems and diminishing patient quality of life. While deep learning models offer promising tools for analyzing Computed Tomography (CT) scans, they often suffer from limitations in generalizability, interpretability, and sensitivity to imbalanced data. This paper introduces SE-FusionEAOO Ensemble, a new robust framework for lung cancer classification. Our approach leverages the strengths of multiple deep learning architectures through a sophisticated two-stage process. First, we construct three powerful feature fusion models by strategically pairing diverse pre-trained networks (DenseNet201/EfficientNetB6, Inception v3/MobileNetV2, DenseNet121/ResNet50), each integrated with Squeeze-and-Excitation (SE) blocks for adaptive feature recalibration. Second, we amalgamate the predictions of these expert models using an intelligently weighted aggregation scheme. The key innovation of our framework is the deployment of a new metaheuristic, the Enhanced Animated Oat Optimization algorithm with Genetic Operators (EAOO-GA), to precisely optimize these ensemble weights, ensuring optimal contribution from each model. To address class imbalance in the IQ-OTH/NCCD lung cancer dataset, we employ the Synthetic Minority Over-sampling Technique (SMOTE), significantly improving the model’s sensitivity to minority classes. Extensive experimental results demonstrate that our framework achieves a state-of-the-art accuracy of 99.40%, with 99.2% precision, 99.5% recall, and 99.3% F1-score, outperforming individual models, conventional ensemble methods, and other metaheuristic optimizers. 
Additionally, the model was externally validated on the LIDC-IDRI dataset, achieving 97.9% accuracy and 97.8% F1-score, confirming its strong generalization capability across independent clinical domains. The proposed framework provides a highly accurate, reliable, and interpretable tool for automated lung cancer detection.
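
The feature recalibration described in the abstract can be sketched as follows: two backbones' feature maps are concatenated along the channel axis, and a Squeeze-and-Excitation block rescales each channel by a learned gate. This is a dependency-free illustration of the general SE mechanism, not the authors' implementation. Channel-wise concatenation as the fusion operator, the placement of the SE block after fusion, and the names `fuse`, `se_block`, `w_reduce`, and `w_expand` are assumptions. In a real model the two weight matrices would be learned during training.

```python
import math


def squeeze(feature_maps):
    """Global average pooling: one scalar per channel (channel = 2D list of floats)."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]


def dense(weights, vec):
    """Bias-free fully connected layer: one output per row of the weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def se_block(feature_maps, w_reduce, w_expand):
    """Squeeze-and-Excitation: scale every channel by a gate in (0, 1).

    w_reduce has shape (C // r, C); w_expand has shape (C, C // r).
    """
    z = squeeze(feature_maps)                                   # squeeze
    hidden = [max(0.0, h) for h in dense(w_reduce, z)]          # ReLU bottleneck
    gates = [1 / (1 + math.exp(-g)) for g in dense(w_expand, hidden)]  # sigmoid
    return [
        [[gate * v for v in row] for row in ch]                 # channel-wise rescale
        for gate, ch in zip(gates, feature_maps)
    ]


def fuse(maps_a, maps_b):
    """Assumed fusion operator: concatenate two backbones' channels."""
    return maps_a + maps_b
```

With all-zero weights every gate equals 0.5, so the block halves each channel. Trained weights instead let the network emphasize informative channels from either backbone and suppress the rest.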

## Linked entities

- **Diseases:** lung cancer (MONDO:0005138)

## Full-text entities

- **Genes:** SHROOM4 (shroom family member 4) [NCBI Gene 57477] {aka MRXSSDS, SHAP, shrm4}
- **Diseases:** XAI (MESH:C538243), squamous cell carcinoma (MESH:D002294), large cell carcinoma (MESH:D018287), NSCLC (MESH:D002289), brain tumor (MESH:D001932), CAD (MESH:C000719218), lung nodule (MESH:D003074), pulmonary nodule (MESH:D055613), SE (MESH:D011595), Lung Cancer (MESH:D008175), lung disease (MESH:D008171), Cancer (MESH:D009369), adenocarcinoma (MESH:D000230), cervical cancer (MESH:D002583), SCLC (MESH:D055752), AOO (MESH:D018288)
- **Chemicals:** cellulose (MESH:D002482), Adam (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Ebola virus (no rank) [taxon 1570291]

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12960684/full.md

---
Source: https://tomesphere.com/paper/PMC12960684