# Real-Time Robust 2.5D Stereo Multi-Object Tracking with Lightweight Stereo Matching Algorithm

**Authors:** Jinhyeong Lee, Junyoung Shin, Eunwoo Park, Daekeun Kim

PMC · DOI: 10.3390/s25216773 · Sensors (Basel, Switzerland) · 2025-11-05

## TL;DR

StereoSORT, a lightweight stereo multi-object tracker, matches bounding boxes across views using geometric constraints alone and reaches a MOTA of 0.932 and an IDF1 of 0.823 at 70 FPS.

## Contribution

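A dual-tracker design keeps independent trackers in the left and right views, and a stereo matching module links them using only bounding box coordinates. Together they improve tracking accuracy over monocular trackers without dense stereo matching or 3D reconstruction.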
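The re-identification idea can be illustrated with a small bookkeeping sketch. This is not the paper's implementation: the class `StereoIdentityLinker`, its `update` method, and the integer track IDs are hypothetical names chosen for illustration, and the sketch assumes that per-view trackers and a stereo matcher already exist and report matched (left, right) track pairs each frame.

```python
class StereoIdentityLinker:
    """Hypothetical bookkeeping that maps per-view track IDs to shared object IDs."""

    def __init__(self):
        self._next_object_id = 0
        self.left_to_object = {}   # left-view track ID -> object ID
        self.right_to_object = {}  # right-view track ID -> object ID

    def update(self, stereo_pairs):
        """Assign an object ID to each (left_track_id, right_track_id) pair.

        If either track in the pair already carries an object ID, that ID is
        reused, so a track that was lost in one view inherits the identity kept
        alive by the other view. The left view wins if the two views disagree.
        """
        assigned = {}
        for left_id, right_id in stereo_pairs:
            object_id = self.left_to_object.get(left_id)
            if object_id is None:
                object_id = self.right_to_object.get(right_id)
            if object_id is None:
                # Neither view has seen this object before: start a new identity.
                object_id = self._next_object_id
                self._next_object_id += 1
            self.left_to_object[left_id] = object_id
            self.right_to_object[right_id] = object_id
            assigned[(left_id, right_id)] = object_id
        return assigned


linker = StereoIdentityLinker()
first = linker.update([(1, 7)])    # object visible in both views
linker.update([])                  # left view occluded, no stereo pair this frame
reacquired = linker.update([(5, 7)])  # left tracker restarts with a new track ID
print(first, reacquired)
```

In this sketch the left tracker issues a fresh track ID after the occlusion, but the pair inherits the original object ID because the right-view track never broke.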

## Key findings

- StereoSORT achieves a MOTA of 0.932 and an IDF1 of 0.823, outperforming monocular trackers such as OC-SORT (IDF1: 0.765) and ByteTrack (IDF1: 0.609).
- The system runs at 70 FPS on standard hardware with a median depth error of 50.1 mm.
- Stereo vision redundancy improves tracking consistency during occlusions and truncations.
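
For reference, the two tracking metrics quoted above are conventionally defined as follows. These are the standard definitions from the tracking literature; this summary does not state the formulas, so treating them as the ones used in the paper is an assumption. Here $\mathrm{FN}_t$, $\mathrm{FP}_t$, $\mathrm{IDSW}_t$, and $\mathrm{GT}_t$ are the false negatives, false positives, identity switches, and ground-truth objects in frame $t$, and $\mathrm{IDTP}$, $\mathrm{IDFP}$, $\mathrm{IDFN}$ are the identity-level true positives, false positives, and false negatives.

$$
\mathrm{MOTA} = 1 - \frac{\sum_t \left( \mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t \right)}{\sum_t \mathrm{GT}_t},
\qquad
\mathrm{IDF1} = \frac{2\,\mathrm{IDTP}}{2\,\mathrm{IDTP} + \mathrm{IDFP} + \mathrm{IDFN}}
$$

MOTA mostly reflects detection quality, while IDF1 rewards keeping the same identity over time, which is why the IDF1 gap to the monocular trackers is the more direct evidence for the occlusion-handling claim.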

## Abstract

What are the main findings?

- Lightweight stereo matching using only bounding box coordinates achieves robust multi-object tracking with a MOTA of 0.932 and an IDF1 of 0.823, outperforming state-of-the-art monocular trackers.
- A dual-tracker design with a re-identification mechanism maintains consistent object identities during occlusions and truncations by leveraging stereo redundancy.

What are the implications of the main findings?

- Resource-efficient 2.5D tracking enables real-time deployment (70 FPS) on standard hardware without expensive 3D reconstruction or dense stereo matching.
- Stereo vision’s inherent redundancy provides a practical solution for robust tracking in challenging real-world scenarios like retail monitoring and autonomous systems.

Multi-object tracking faces persistent challenges from occlusions and truncations in monocular vision systems. While stereo vision provides depth information, existing approaches require computationally expensive dense matching or 3D reconstruction. This paper presents a real-time 2.5D stereo multi-object tracking framework combining lightweight stereo matching with resilient tracker management. The stereo matching module employs Direct Linear Transform-based triangulation using only bounding box coordinates, eliminating costly feature extraction while maintaining robust correspondence through geometric constraints. A dual-tracker architecture maintains independent trackers in both views, enabling re-identification when objects become occluded in one view but remain visible in the other. Experimental validation on a refrigerator monitoring dataset demonstrates that StereoSORT achieves a multiple object tracking accuracy (MOTA) of 0.932 and an identification F1 score (IDF1) of 0.823, substantially outperforming monocular trackers, including OC-SORT (IDF1: 0.765) and ByteTrack (IDF1: 0.609). The system achieves a 50.1 mm median depth error, comparable to commercial sensors, while maintaining 70 FPS on standard hardware. These results validate that geometric constraints alone enable robust stereo tracking without appearance features, offering a practical solution for resource-constrained environments where computational efficiency and tracking reliability are equally critical.
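
As a rough illustration of Direct Linear Transform triangulation from bounding boxes, the sketch below triangulates the centers of a left and a right box and uses the reprojection error as a geometric plausibility check for the pairing. It is a minimal sketch under stated assumptions, not the authors' code: using box centers, the reprojection-error check, the function names, and the camera parameters (`K`, `P_LEFT`, `P_RIGHT`, a 0.1 m baseline) are all hypothetical, since this summary does not specify which box coordinates or constraints the paper uses.

```python
import numpy as np


def box_center(box):
    """Center (u, v) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return np.array([(x_min + x_max) / 2.0, (y_min + y_max) / 2.0])


def project(P, X):
    """Project a 3D point X with the 3x4 projection matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


def triangulate_dlt(P_left, P_right, uv_left, uv_right):
    """Homogeneous DLT: two linear constraints per view, solved via SVD."""
    rows = []
    for P, (u, v) in ((P_left, uv_left), (P_right, uv_right)):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]
    return X_h[:3] / X_h[3]


def match_score(P_left, P_right, box_left, box_right):
    """Triangulate two box centers and return (3D point, worst reprojection error in px)."""
    uv_l, uv_r = box_center(box_left), box_center(box_right)
    X = triangulate_dlt(P_left, P_right, uv_l, uv_r)
    err_l = np.linalg.norm(project(P_left, X) - uv_l)
    err_r = np.linalg.norm(project(P_right, X) - uv_r)
    return X, float(max(err_l, err_r))


# Hypothetical rectified stereo rig: shared intrinsics, 0.1 m horizontal baseline.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P_LEFT = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_RIGHT = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

left_box = (400.0, 160.0, 450.0, 210.0)
right_box = (347.0, 160.0, 397.0, 210.0)
point, error_px = match_score(P_LEFT, P_RIGHT, left_box, right_box)
print(point, error_px)
```

A candidate pair whose centers are geometrically inconsistent (for example, on different image rows in a rectified rig) produces a large reprojection error and can be rejected with a pixel threshold, which is one way a matcher could avoid appearance features entirely.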

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12610995/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12610995/full.md

## References

58 references — full list in the complete paper: https://tomesphere.com/paper/PMC12610995/full.md

---
Source: https://tomesphere.com/paper/PMC12610995