# Improving Tree-Based Lung Disease Classification from Chest X-Ray Images Using Deep Feature Representations

**Authors:** Abdulaziz A. Alsulami, Qasem Abu Al-Haija, Rayed Alakhtar, Huda Alsobhi, Rayan A. Alsemmeari, Badraddin Alturki, Ahmad J. Tayeb

PMC · DOI: 10.3390/bioengineering13030267 · Bioengineering · 2026-02-25

## TL;DR

This paper introduces a hybrid framework that feeds deep features from a fine-tuned ResNet-18 into tree-based classifiers to classify lung diseases in chest X-rays, reporting weighted F1-scores of 0.977 to 0.982 with low per-sample inference latency.

## Contribution

A novel hybrid CNN–tree framework that combines deep feature extraction with interpretable tree classifiers for efficient and accurate lung disease classification.

## Key findings

- Decision Tree, Random Forest, and XGBoost classifiers achieved weighted F1-scores between 0.977 and 0.982 under five-fold stratified cross-validation, using deep features from a fine-tuned ResNet-18.
- The framework reduces inter-class confusion and maintains low inference latency for scalable deployment.
- Combining deep features with interpretable models improves robustness and generalization on a unified five-class dataset merged from four publicly available chest X-ray sources.
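
The headline metric in these findings is the weighted F1-score. As a minimal sketch of what that number measures, the function below computes per-class F1 and averages it with weights proportional to each class's support. The function name `weighted_f1` and the toy labels are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_true - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive
        # because every label here has at least one true sample.
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += (n_true / total) * f1
    return score


# Toy labels (made up for this example, three of the five classes).
y_true = ["normal"] * 4 + ["pneumonia"] * 2 + ["tuberculosis"] * 2
y_pred = ["normal", "normal", "normal", "pneumonia",
          "pneumonia", "pneumonia", "tuberculosis", "normal"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because each class contributes in proportion to its support, large classes dominate the weighted score, which is one reason it is usually read alongside the confusion matrix.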

## Abstract

Healthcare systems worldwide face increasing pressure to deliver accurate, affordable, and scalable diagnostic services while maintaining long-term sustainability. Chest X-ray screening is considered one of the most cost-effective methods for detecting lung disease. However, many deep learning approaches are computationally intensive and difficult to interpret, which limits their adoption in high-throughput, resource-constrained clinical settings. This study proposes a hybrid CNN–tree framework for automated lung disease classification from chest X-ray images, targeting COVID-19, pneumonia, tuberculosis, lung cancer, and normal cases. To ensure robustness and generalization, four publicly available chest X-ray datasets from different sources are merged into a unified five-class dataset, which introduces realistic variations in imaging conditions and patient populations. A ResNet-18 model is fine-tuned to extract domain-specific deep feature representations. Feature dimensionality and redundancy are reduced using Principal Component Analysis, while class imbalance is addressed through the Synthetic Minority Over-sampling Technique. The resulting compact feature vectors are used to train interpretable tree-based classifiers: Decision Tree, Random Forest, and XGBoost. Experiments conducted using five-fold stratified cross-validation demonstrate substantial and consistent performance gains. When trained on fine-tuned and preprocessed deep features, all evaluated tree-based classifiers achieve weighted F1-scores between 0.977 and 0.982, with a significant reduction in inter-class confusion. In addition, the proposed framework maintains low per-sample inference latency, which supports energy-efficient and scalable deployment. These results indicate that combining deep feature learning with interpretable tree-based models provides a practical and reliable solution for sustainable chest X-ray screening in real-world clinical environments.
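
Two of the steps named in the abstract, stratified five-fold splitting and SMOTE-style oversampling, can be illustrated in plain Python. This is a simplified sketch under stated assumptions: the helper names `stratified_folds` and `smote_like`, the neighbour count, and the toy data are all hypothetical, and the authors' actual implementation, parameters, and the order in which they apply these steps are not given in this summary. Feature extraction with ResNet-18 and PCA are not covered here.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so class ratios stay similar.

    Hypothetical helper: indices of each class are shuffled and then
    dealt round-robin across the folds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def smote_like(minority, n_new, k_neighbors=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its nearest minority-class neighbours.

    This is only the core interpolation idea behind SMOTE, not a full
    reimplementation of the published algorithm.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Sort the remaining points by squared Euclidean distance to base.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k_neighbors])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy data (made up): 10 "normal" and 5 "tuberculosis" samples.
toy_labels = ["normal"] * 10 + ["tuberculosis"] * 5
toy_folds = stratified_folds(toy_labels, k=5)
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_synthetic = smote_like(toy_minority, n_new=4)
```

In this toy setup each fold receives two "normal" indices and one "tuberculosis" index, mirroring the overall 2:1 class ratio, and every synthetic point lies on a line segment between two existing minority points.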

## Linked entities

- **Diseases:** COVID-19 (MONDO:0100096), pneumonia (MONDO:0005249), tuberculosis (MONDO:0018076), lung cancer (MONDO:0005138)

## Full-text entities

- **Diseases:** COVID-19 (MESH:D000086382), Lung Disease (MESH:D008171), pneumonia (MESH:D011014), lung cancer (MESH:D008175), tuberculosis (MESH:D014376)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13024033/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC13024033/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/PMC13024033/full.md

---
Source: https://tomesphere.com/paper/PMC13024033