YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID
Iñaki Erregue, Kamal Nasrollahi, Sergio Escalera

TL;DR
YOLO11-JDE is a real-time multi-object tracking system that combines detection and self-supervised Re-ID, achieving high accuracy and efficiency without needing labeled identity data.
Contribution
The paper presents YOLO11-JDE, a novel multi-object tracking approach that integrates self-supervised Re-ID into YOLO11, reducing data labeling costs and improving speed and accuracy.
Findings
Outperforms existing JDE methods on MOT benchmarks
Runs at higher FPS than existing JDE methods, with fewer parameters
Eliminates need for labeled identity datasets
Abstract
We introduce YOLO11-JDE, a fast and accurate multi-object tracking (MOT) solution that combines real-time object detection with self-supervised Re-Identification (Re-ID). By incorporating a dedicated Re-ID branch into YOLO11s, our model performs Joint Detection and Embedding (JDE), generating appearance features for each detection. The Re-ID branch is trained in a fully self-supervised setting while simultaneously training for detection, eliminating the need for costly identity-labeled datasets. The triplet loss, with hard positive and semi-hard negative mining strategies, is used for learning discriminative embeddings. Data association is enhanced with a custom tracking implementation that successfully integrates motion, appearance, and location cues. YOLO11-JDE achieves competitive results on MOT17 and MOT20 benchmarks, surpassing existing JDE methods in terms of FPS and using up to…
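The triplet loss with hard positive and semi-hard negative mining mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `triplet_loss_mined`, the margin value, the L2 normalization, the Euclidean distance, and the fallback to the closest negative when no semi-hard negative exists are all assumptions made for illustration. The `ids` argument stands in for whatever pseudo-identities the self-supervised setup provides; how those are produced is not stated in this abstract.

```python
import numpy as np

def triplet_loss_mined(embeddings, ids, margin=0.3):
    """Illustrative triplet loss with per-anchor mining (hypothetical helper).

    embeddings: (N, D) array of appearance features, one per detection.
    ids:        (N,) array of pseudo-identity labels (assumed to be given).
    margin:     triplet margin (assumed value, not taken from the paper).
    """
    emb = np.asarray(embeddings, dtype=np.float64)
    ids = np.asarray(ids)

    # Assumption: compare L2-normalized embeddings with Euclidean distance.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1) + 1e-12)

    same = ids[:, None] == ids[None, :]
    pos_mask = same & ~np.eye(len(ids), dtype=bool)  # same id, not the anchor
    neg_mask = ~same                                 # different id

    losses = []
    for a in range(len(ids)):
        if not pos_mask[a].any() or not neg_mask[a].any():
            continue  # anchor has no valid triplet in this batch

        # Hard positive: the farthest sample sharing the anchor's id.
        d_ap = dist[a][pos_mask[a]].max()

        # Semi-hard negatives: farther than the positive, but within the margin.
        neg_d = dist[a][neg_mask[a]]
        semi_hard = neg_d[(neg_d > d_ap) & (neg_d < d_ap + margin)]

        # Assumed fallback: use the closest negative if none is semi-hard.
        d_an = semi_hard.min() if semi_hard.size else neg_d.min()

        losses.append(max(d_ap - d_an + margin, 0.0))

    return float(np.mean(losses)) if losses else 0.0
```

With this sketch, well-separated identity clusters yield a loss of zero, while fully collapsed embeddings (all detections mapped to the same vector) yield a loss equal to the margin, which is the behavior that pushes the Re-ID branch toward discriminative features.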
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Chemical Sensor Technologies · Advanced Image and Video Retrieval Techniques
