StrongSORT: Make DeepSORT Great Again
Yunhao Du, Zhicheng Zhao, Yang Song, Yanyun Zhao, Fei Su, Tao Gong, Hongying Meng

TL;DR
StrongSORT significantly enhances DeepSORT for multi-object tracking by improving detection, embedding, and association, and introduces two lightweight algorithms, AFLink and GSI, to address missing association and missing detection, achieving state-of-the-art results.
Contribution
The paper presents StrongSORT, a robust baseline for MOT, and introduces AFLink and GSI algorithms that improve association and detection with minimal computational overhead.
Findings
Achieves state-of-the-art results on MOT17, MOT20, DanceTrack, and KITTI benchmarks.
Provides a lightweight, plug-and-play solution for missing association and detection problems.
Demonstrates improved speed-accuracy trade-offs in multi-object tracking.
Abstract
Recently, Multi-Object Tracking (MOT) has attracted rising attention, and accordingly, remarkable progress has been achieved. However, existing methods tend to use various basic models (e.g., detector and embedding model) and different training or inference tricks. As a result, the construction of a good baseline for a fair comparison is essential. In this paper, a classic tracker, i.e., DeepSORT, is first revisited and then significantly improved from multiple perspectives such as object detection, feature embedding, and trajectory association. The proposed tracker, named StrongSORT, contributes a strong and fair baseline for the MOT community. Moreover, two lightweight and plug-and-play algorithms are proposed to address two inherent "missing" problems of MOT: missing association and missing detection. Specifically, unlike most methods, which associate short tracklets…
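The "missing detection" problem mentioned in the abstract can be illustrated with a small sketch. The abstract is truncated here and does not describe how GSI works, so the code below is not the authors' method: it is a simplified stand-in that fills gaps in a track by linear interpolation and then smooths the result with a Gaussian-weighted average. The function names `interpolate_gaps` and `gaussian_smooth`, the `(x, y, w, h)` box layout, and the `max_gap` and `sigma` defaults are all assumptions made for illustration.

```python
import math


def interpolate_gaps(track, max_gap=20):
    """Fill missing frames in a track by linear interpolation.

    track: dict mapping frame index -> (x, y, w, h) box (hypothetical layout).
    Gaps longer than max_gap frames are left unfilled (arbitrary threshold).
    """
    frames = sorted(track)
    filled = dict(track)
    for prev, nxt in zip(frames, frames[1:]):
        gap = nxt - prev
        if 1 < gap <= max_gap:
            for f in range(prev + 1, nxt):
                t = (f - prev) / gap  # fraction of the way from prev to nxt
                filled[f] = tuple(
                    a + t * (b - a) for a, b in zip(track[prev], track[nxt])
                )
    return filled


def gaussian_smooth(track, sigma=2.0):
    """Smooth each box coordinate with a Gaussian-weighted average over frames.

    This kernel average is a simple stand-in for a proper smoother; sigma
    (in frames) is an illustrative value, not one taken from the paper.
    """
    frames = sorted(track)
    smoothed = {}
    for f in frames:
        weights = [math.exp(-0.5 * ((f - g) / sigma) ** 2) for g in frames]
        total = sum(weights)
        smoothed[f] = tuple(
            sum(w * track[g][k] for w, g in zip(weights, frames)) / total
            for k in range(4)
        )
    return smoothed


# A toy track with detections missing at frames 3 and 4.
track = {
    1: (10.0, 10.0, 5.0, 5.0),
    2: (12.0, 10.0, 5.0, 5.0),
    5: (18.0, 10.0, 5.0, 5.0),
}
filled = interpolate_gaps(track)
smoothed = gaussian_smooth(filled)
```

In this toy example the interpolation step restores boxes for frames 3 and 4, and the smoothing step then adjusts every frame's box toward its neighbours. Both steps run after tracking on the finished trajectory, which is one way a post-processing step of this kind can remain plug-and-play.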
Taxonomy
Topics: Video Surveillance and Tracking Methods
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Gaussian Process
