# Pneumonia and pneumothorax detection: A multi-factor evaluation of chest X-rays

**Authors:** Yousef Saad Aldabayan

PMC · DOI: 10.1371/journal.pone.0341060 · PLOS One · 2026-01-20

## TL;DR

This paper introduces a Vision Transformer system for detecting pneumonia and pneumothorax in chest X-rays, improving accuracy and interpretability in medical imaging.

## Contribution

The novel contribution is a medical imaging system using ViT with radiograph-specific augmentation and class imbalance handling for better diagnostic accuracy.
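To make the class imbalance handling concrete, here is a minimal sketch of inverse-frequency sample weighting for multi-label data. The toy label distribution, the `sample_weights` helper, and the rule of taking the rarest label's weight per image are illustrative assumptions, not the paper's exact weighting scheme. The standard library's `random.choices` stands in for a weighted sampler so the example runs without PyTorch.

```python
import random
from collections import Counter


def sample_weights(label_sets):
    """Return one sampling weight per image (hypothetical helper).

    Each image is weighted by the inverse frequency of its rarest label;
    images with no finding are counted under a synthetic "normal" class.
    """
    counts = Counter()
    for labels in label_sets:
        counts.update(labels if labels else {"normal"})
    return [
        max(1.0 / counts[name] for name in (labels if labels else {"normal"}))
        for labels in label_sets
    ]


# Hypothetical toy distribution: 90 normal, 8 pneumonia, 2 pneumothorax.
label_sets = [set()] * 90 + [{"pneumonia"}] * 8 + [{"pneumothorax"}] * 2
weights = sample_weights(label_sets)

# Weighted sampling with replacement, seeded for reproducibility.
rng = random.Random(0)
draws = rng.choices(range(len(label_sets)), weights=weights, k=3000)
drawn = Counter(next(iter(label_sets[i]), "normal") for i in draws)
```

With these weights each of the three groups carries the same total sampling mass, so rare findings appear in training batches about as often as normal images. In PyTorch, a weight list like this would typically be passed to `torch.utils.data.WeightedRandomSampler` with `replacement=True`.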

## Key findings

- ViT models achieved 70–75% accuracy and 0.63–0.71 AUC for pneumonia and pneumothorax detection.
- Medical-focused data preparation and training approaches significantly improved ViT performance.
- Attention-based visualization enhanced interpretability by highlighting important radiological areas.
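
The accuracy and AUC figures above are easier to read alongside the other metrics the paper reports. The sketch below is illustrative only: the data is a made-up toy example, and `binary_metrics` and `auc` are hypothetical helper names. It shows why accuracy alone misleads under class imbalance, and computes AUC as the fraction of positive/negative pairs ranked correctly.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, precision, and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of samples,
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy case: 5 diseased images out of 100, model predicts "normal" for all.
y_true = [1] * 5 + [0] * 95
all_normal = binary_metrics(y_true, [0] * 100)

# Small ranking example for AUC.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In the toy case the all-normal classifier reaches 95% accuracy while detecting no disease at all (sensitivity 0), which is the failure mode that motivates reporting sensitivity, specificity, precision, F1-score, and AUC.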

## Abstract

This study develops a Vision Transformer (ViT) diagnostic system that identifies pneumonia and pneumothorax in chest radiographs from the NIH ChestX-ray14 dataset. The methodology has three components: (i) radiograph-specific augmentation that simulates realistic imaging conditions; (ii) multi-label imbalance handling through a WeightedRandomSampler with class-specific weights, which prevents the model from collapsing to all-normal predictions; and (iii) optimization improvements, namely CosineAnnealingWarmRestarts scheduling, an optimized sigmoid-based classification head, and disease-specific threshold optimization. Because accuracy is uninformative under severe class imbalance, model performance is evaluated with AUC, sensitivity, specificity, precision, and F1-score. In non-leaking, noise-aware experiments, the ViT models reach 70–75% accuracy and AUC values of 0.63–0.71 for both target conditions, results that reflect the weak labels and limited supervision of the ChestX-ray14 dataset. The system improves detection of rare conditions and offers better interpretability through attention-based visualization of important radiological areas. The results indicate that ViT performance improves significantly with medically focused data preparation and training, suggesting potential for radiology assistance in high-volume and resource-constrained settings.
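
Two of the optimization steps named in the abstract can be sketched in plain Python. The first function reproduces the standard cosine-annealing-with-warm-restarts learning-rate formula; the second picks a per-disease decision threshold. The hyperparameters (`base_lr`, `t_0`, `t_mult`), the toy scores, and the choice of Youden's J as the threshold criterion are assumptions for illustration, since this summary does not state the paper's actual values or criterion.

```python
import math


def cosine_warm_restarts_lr(epoch, base_lr, t_0, t_mult=1, eta_min=0.0):
    """Learning rate at `epoch` under cosine annealing with warm restarts.

    The first cycle lasts t_0 epochs; each later cycle is t_mult times longer.
    Within a cycle the rate decays from base_lr to eta_min along a half cosine.
    """
    t_i, t_cur = t_0, epoch
    while t_cur >= t_i:  # skip past completed cycles
        t_cur -= t_i
        t_i *= t_mult
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))


def youden_j(y_true, scores, thr):
    """Sensitivity + specificity - 1 when predicting positive for score >= thr."""
    tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= thr)
    fn = sum(1 for t, s in zip(y_true, scores) if t == 1 and s < thr)
    tn = sum(1 for t, s in zip(y_true, scores) if t == 0 and s < thr)
    fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= thr)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens + spec - 1.0


def pick_threshold(y_true, scores):
    """Choose the observed score that maximizes Youden's J (assumed criterion)."""
    return max(sorted(set(scores)), key=lambda thr: youden_j(y_true, scores, thr))


# Hypothetical sigmoid outputs for one disease on a small validation set.
val_labels = [0, 0, 0, 1, 0, 1]
val_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60]
threshold = pick_threshold(val_labels, val_scores)
```

Running `pick_threshold` once per disease yields disease-specific thresholds. In this toy example the selected threshold falls well below 0.5, which is typical when a rare class produces low sigmoid scores. In PyTorch the schedule is available as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`.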

## Linked entities

- **Diseases:** pneumonia (MONDO:0005249), pneumothorax (MONDO:0002076)

## Full-text entities

- **Diseases:** Pneumonia (MESH:D011014), pneumothorax (MESH:D011030)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12818600/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12818600/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/PMC12818600/full.md

---
Source: https://tomesphere.com/paper/PMC12818600