# ESA-YOLOv5m: a lightweight spatial and improved attention-driven detection for brain tumor MRI analysis

**Authors:** Maram Fahaad Almufareh, Noshina Tariq, Mamoona Humayun, Haya Aldossary, Meshal Alharbi

PMC · DOI: 10.3389/fmed.2025.1733180 · Frontiers in Medicine · 2025-12-19

## TL;DR

This paper introduces ESA-YOLOv5m, a lightweight YOLOv5m variant that adds an Enhanced Spatial Attention (ESA) module to improve brain tumor detection in MRI scans, reaching 91% mAP@0.5 with an average latency below 10 ms per image.

## Contribution

The paper integrates an Enhanced Spatial Attention (ESA) module after the SPPF layer of YOLOv5m, which improves tumor detection accuracy while adding less than 4.3% to the parameter count.

## Key findings

- ESA-YOLOv5m achieved 90% precision, 90% recall, and 91% mAP@0.5, outperforming the baseline YOLOv5m by approximately 11%–12%.
- Placing the ESA module after the SPPF layer yielded the highest performance (mAP@0.5 = 0.91).
- The model remained efficient, with less than a 4.3% increase in parameters and an average latency below 10 ms per image.
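To make the "@0.5" in these metrics concrete, here is a minimal sketch of how precision and recall can be computed at an IoU threshold of 0.5. It is not the paper's evaluation code: the function names, the corner-format boxes, and the greedy confidence-ordered matching are assumptions chosen for illustration. Full mAP additionally averages precision over recall levels and over classes, which this sketch omits.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]], truths: List[Box], thr: float = 0.5
) -> Tuple[float, float]:
    """Greedy matching: visit predictions by descending confidence and
    match each ground-truth box at most once."""
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, truth in enumerate(truths):
            if j in matched:
                continue
            value = iou(box, truth)
            if value > best_iou:
                best_j, best_iou = j, value
        # A prediction counts as a true positive only if its best
        # unmatched ground-truth box overlaps by at least `thr`.
        if best_j >= 0 and best_iou >= thr:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(truths) if truths else 0.0
    return precision, recall
```

With two ground-truth boxes and three predictions, two of which overlap a ground-truth box by at least 0.5, this returns a precision of 2/3 and a recall of 1.0.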

## Abstract

The early and accurate detection of brain tumors is vital for improving patient outcomes, enabling timely clinical interventions, and reducing diagnostic uncertainty. Despite advances in deep learning, conventional Convolutional Neural Network (CNN)-based models often struggle with small or low-contrast tumors. They also remain computationally demanding for real-time clinical deployment.

This study presents an Enhanced Spatial Attention (ESA)-integrated You Only Look Once v5 medium (YOLOv5m) architecture, a lightweight and efficient framework for brain tumor detection in MRI scans. The ESA module, positioned after the Spatial Pyramid Pooling-Fast (SPPF) layer, enhances feature discrimination by emphasizing diagnostically relevant regions while suppressing background noise, thereby improving localization accuracy without increasing computational complexity. Experiments were conducted on the Figshare brain tumor MRI dataset containing three tumor classes: glioma, meningioma, and pituitary.
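The abstract does not spell out the internals of the ESA module, so the following is only a rough illustration of what a spatial attention gate does. It assumes a CBAM-style design (channel-wise mean and max descriptors, a small convolution, a sigmoid gate), which may differ from the authors' module. The function name `spatial_attention` and the `kernel` layout are hypothetical. It is written in plain Python on nested lists for self-containment, whereas a real detector would use a tensor library.

```python
import math
from typing import List

Map2D = List[List[float]]


def spatial_attention(
    features: List[Map2D], kernel: List[Map2D], bias: float = 0.0
) -> List[Map2D]:
    """Reweight every channel of `features` (C x H x W) by one H x W gate.

    `kernel` has shape 2 x k x k with k odd: one filter for the
    channel-mean map and one for the channel-max map (assumed design).
    """
    c, h, w = len(features), len(features[0]), len(features[0][0])
    k = len(kernel[0])
    pad = k // 2

    # Step 1: squeeze the channel axis into two descriptors per pixel.
    mean_map = [[sum(features[ch][y][x] for ch in range(c)) / c
                 for x in range(w)] for y in range(h)]
    max_map = [[max(features[ch][y][x] for ch in range(c))
                for x in range(w)] for y in range(h)]
    descriptors = [mean_map, max_map]

    # Step 2: k x k convolution with zero padding, then a sigmoid gate.
    gate = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = bias
            for d in range(2):
                for dy in range(k):
                    for dx in range(k):
                        yy, xx = y + dy - pad, x + dx - pad
                        if 0 <= yy < h and 0 <= xx < w:
                            acc += kernel[d][dy][dx] * descriptors[d][yy][xx]
            gate[y][x] = 1.0 / (1.0 + math.exp(-acc))

    # Step 3: apply the same gate to every channel; the shape is unchanged.
    return [[[features[ch][y][x] * gate[y][x] for x in range(w)]
             for y in range(h)] for ch in range(c)]
```

Because the output has the same shape as the input, a gate of this kind can in principle be placed after a block such as SPPF without changing the layers that follow. Its only learnable parameters are the small kernel and bias, so a module like this adds little to the parameter count.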

ESA-YOLOv5m achieved a Precision of 90%, Recall of 90%, and mean Average Precision (mAP)@0.5 of 91%, surpassing the baseline YOLOv5m by approximately 11%–12%. An ablation study further confirmed that placing the ESA module after the SPPF layer yields the highest performance (mAP@0.5 = 0.91), while earlier integration produced marginally lower results. Classwise analyses demonstrated consistent gains (mAP range 0.87–0.98), and fivefold cross-validation showed stable performance (mAP@0.5 = 0.910 ± 0.006). Efficiency tests revealed negligible overhead, with less than a 4.3% increase in parameters and an average latency below 10 ms per image.
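Two of the reported quantities are simple summary statistics: a "mean ± spread" over cross-validation folds and a relative parameter overhead. The sketch below shows one way to compute them. The helper names are hypothetical, the numbers in the usage lines are placeholders rather than the paper's per-fold scores or parameter counts, and the use of the sample standard deviation is an assumption, since the abstract does not say which spread measure was used.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def summarize_folds(scores: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of per-fold scores."""
    return mean(scores), stdev(scores)


def relative_overhead(baseline: float, modified: float) -> float:
    """Percentage increase of `modified` over `baseline`."""
    return 100.0 * (modified - baseline) / baseline


# Placeholder inputs for illustration only (not values from the paper).
fold_map = [0.90, 0.91, 0.92]
avg, spread = summarize_folds(fold_map)
print(f"mAP@0.5 = {avg:.3f} ± {spread:.3f}")
print(f"parameter overhead = {relative_overhead(1_000_000, 1_040_000):.1f}%")
```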

Overall, the results validate that integrating a lightweight spatial attention mechanism significantly enhances tumor localization and model generalization while preserving real-time inference. The proposed ESA-YOLOv5m framework provides a reliable and scalable solution for automated brain tumor detection, suitable for clinical decision-support systems and edge healthcare applications.

## Linked entities

- **Diseases:** brain tumor (MONDO:0021211), glioma (MONDO:0021042), meningioma (MONDO:0003057), pituitary (MONDO:0021156)

## Full-text entities

- **Diseases:** glioma (MESH:D005910), meningioma (MESH:D008579), tumor (MESH:D009369), brain tumor (MESH:D001932)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12757366/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12757366/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/PMC12757366/full.md

---
Source: https://tomesphere.com/paper/PMC12757366