# A Study on Bus Passenger Boarding and Alighting Detection and Recognition Based on Video Images and YOLO Algorithm

**Authors:** Wei Xu, Yushan Zhao, Xiaodong Du, Haoyang Ji, Lei Xing

PMC · DOI: 10.3390/s26051418 · Sensors (Basel, Switzerland) · 2026-02-24

## TL;DR

This paper improves bus passenger detection with a modified YOLOv8n and pairs it with DeepSORT tracking to capture boarding and alighting behavior from on-vehicle video, supporting origin–destination (OD) data collection for intelligent public transportation.

## Contribution

The paper introduces a modified YOLOv8n with DAC2f, SWD-PAN, and WIoUv3 for improved detection of bus passengers in challenging on-vehicle scenarios.

## Key findings

- The modified YOLOv8n achieved 3.68% higher precision, 5.12% higher recall, and 6.26% higher mAP compared to the baseline.
- Integration with DeepSORT improved tracking stability, achieving MOTA of 31.24% and MOTP of 88.06%.
- The approach addresses limitations in traditional OD data collection for intelligent public transportation.
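
For readers unfamiliar with the tracking metrics quoted above, here is a minimal sketch of how MOTA and MOTP are conventionally computed under the CLEAR MOT definitions. It assumes the overlap-based form of MOTP (mean IoU of matched pairs, higher is better); this summary does not say which MOTP definition the authors use. The `FrameStats` container and the example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameStats:
    """Per-frame tracking errors (hypothetical container for illustration)."""
    gt: int                # ground-truth objects in the frame
    false_neg: int         # ground-truth objects the tracker missed
    false_pos: int         # tracker outputs with no matching ground truth
    id_switches: int       # matched objects whose track ID changed
    overlaps: List[float]  # IoU of each matched (track, ground-truth) pair


def mota(frames: List[FrameStats]) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, summed over all frames."""
    total_gt = sum(f.gt for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(f.false_neg + f.false_pos + f.id_switches for f in frames)
    return 1.0 - errors / total_gt


def motp(frames: List[FrameStats]) -> float:
    """Overlap-based MOTP: mean IoU over all matched pairs."""
    overlaps = [o for f in frames for o in f.overlaps]
    if not overlaps:
        raise ValueError("MOTP is undefined without matched pairs")
    return sum(overlaps) / len(overlaps)


# Made-up counts for two frames, only to exercise the functions.
example = [
    FrameStats(gt=10, false_neg=2, false_pos=1, id_switches=0, overlaps=[0.9] * 8),
    FrameStats(gt=10, false_neg=3, false_pos=0, id_switches=1, overlaps=[0.8] * 7),
]
print(f"MOTA={mota(example):.4f}  MOTP={motp(example):.4f}")
```

Because MOTA subtracts all three error types from 1, it can be low, or even negative, while MOTP stays high: MOTP measures only how well the matched boxes are localized.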

## Abstract

Public transportation is central to easing urban traffic congestion, reducing pollution, and making smart city transportation more intelligent. Its refined operation relies heavily on accurate, real-time passenger origin–destination (OD) data. However, traditional manual surveys are costly and have low sampling rates, while smart card data lacks alighting information and contains deviations; neither reflects real travel behavior, and this has become a bottleneck for the development of intelligent public transportation. To address this, the paper proposes a method for detecting and recognizing bus passenger boarding and alighting based on video images and the YOLO algorithm. To overcome the shortcomings of standard YOLO in on-vehicle scenarios (insufficient feature extraction, inefficient feature fusion, slow convergence), the baseline YOLOv8n is improved for the high density, heavy occlusion, and variable target scales of bus scenes: (1) a DAC2f structure (deformable attention + C2f) captures the core features of occluded passengers and suppresses background interference; (2) SWD-PAN enables bidirectional cross-scale feature interaction to adapt to scale differences; and (3) WIoUv3 balances sample weights for small targets and passengers in non-standard postures. Experiments show that precision, recall, and mAP increase by 3.68%, 5.12%, and 6.26%, respectively, while the model still meets real-time requirements. The improved YOLOv8n is then integrated with DeepSORT to enhance tracking stability. Tests show that MOTA reaches 31.24% (2.6% higher than YOLOv8n and 16.4% higher than YOLO-X) and MOTP reaches 88.06%, addressing trajectory breakage and ID switching. The approach addresses the main weaknesses of traditional OD data collection and provides technical support for the refined management of intelligent public transportation and the optimization of smart city transportation.
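
The abstract names WIoUv3 as the bounding-box loss but this summary does not define it. The sketch below follows the Wise-IoU v3 formulation as commonly published: a distance-based factor scales the IoU loss, and a non-monotonic focusing gain down-weights both very easy boxes and outliers. It is an illustration under stated assumptions, not the authors' implementation. The hyperparameters `alpha=1.9` and `delta=3.0` are assumed defaults, `mean_iou_loss` stands in for the running mean that a training loop would maintain, and the helper names are hypothetical.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def focusing_gain(beta: float, alpha: float = 1.9, delta: float = 3.0) -> float:
    """Non-monotonic gain r = beta / (delta * alpha**(beta - delta)).

    beta is the outlier degree: this box's IoU loss divided by the mean IoU loss.
    The gain equals 1 when beta == delta and shrinks for large beta (outliers).
    """
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3(pred: Box, target: Box, mean_iou_loss: float,
            alpha: float = 1.9, delta: float = 3.0) -> float:
    """Wise-IoU v3 loss for one box pair (plain Python, no autograd)."""
    if mean_iou_loss <= 0:
        raise ValueError("mean_iou_loss must be positive")
    l_iou = 1.0 - iou(pred, target)
    # Width and height of the smallest box enclosing both boxes.
    # In a real training loop this term is detached from the gradient.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    # Squared distance between the two box centers.
    dx = (pred[0] + pred[2]) / 2 - (target[0] + target[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (target[1] + target[3]) / 2
    r_wiou = math.exp((dx * dx + dy * dy) / (wg * wg + hg * hg))
    beta = l_iou / mean_iou_loss  # outlier degree
    return focusing_gain(beta, alpha, delta) * r_wiou * l_iou


# Hypothetical boxes: a prediction shifted one unit to the right of its target.
print(wiou_v3((1.0, 0.0, 3.0, 2.0), (0.0, 0.0, 2.0, 2.0), mean_iou_loss=0.5))
```

The gain is the part that matches the abstract's description of balancing sample weights: boxes whose loss is far above the running mean are treated as low-quality outliers and contribute less, so they do not dominate the regression gradient.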

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12987156/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12987156/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC12987156/full.md

---
Source: https://tomesphere.com/paper/PMC12987156