SimpleTrack: Rethinking and Improving the JDE Approach for Multi-Object Tracking
Jiaxin Li, Yan Ding, Hualiang Wei

TL;DR
SimpleTrack introduces a new association matrix (the EG matrix) and a bottom-up fusion method for re-identification, improving data association (IDF1, HOTA, IDsw) over existing JDE-based methods and raising tracking speed by approximately 20%.
Contribution
The paper proposes the EG association matrix and a simple, effective tracker called SimpleTrack, advancing data association and re-identification in JDE-based multi-object tracking.
Findings
Achieved 61.6 HOTA and 76.3 IDF1 on MOT17.
Improved IDF1, HOTA, and IDsw metrics across five SOTA JDE methods.
Increased tracking speed by approximately 20%.
Abstract
Joint detection and embedding (JDE) based methods usually estimate bounding boxes and embedding features of objects with a single network in Multi-Object Tracking (MOT). In the tracking stage, JDE-based methods fuse the target motion information and appearance information by applying the same rule, which could fail when the target is briefly lost or blocked. To overcome this problem, we propose a new association matrix, the Embedding and GIoU (EG) matrix, which combines the embedding cosine distance and the GIoU distance of objects. To further improve the performance of data association, we develop a simple, effective tracker named SimpleTrack, which designs a bottom-up fusion method for re-identification (Re-ID) and proposes a new tracking strategy based on our EG matrix. The experimental results indicate that SimpleTrack has powerful data association capability, e.g., 61.6 HOTA and 76.3 IDF1 on MOT17. In…
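As a rough illustration of the idea in the abstract, the sketch below builds a cost matrix from an embedding cosine distance and a GIoU distance for every track/detection pair. This summary does not state the paper's exact fusion rule, so the weighted sum, the `weight` parameter, the rescaling of the GIoU distance to [0, 1], and the names `giou`, `cosine_distance`, and `eg_matrix` are all assumptions made for this example, not the authors' implementation. Boxes are assumed to be `(x1, y1, x2, y2)` with positive area.

```python
import math


def giou(a, b):
    """Generalized IoU of two boxes (x1, y1, x2, y2); value lies in (-1, 1]."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Area of the smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (hull - union) / hull


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm


def eg_matrix(track_boxes, track_embs, det_boxes, det_embs, weight=0.5):
    """Cost matrix with one row per track and one column per detection.

    ASSUMPTION: the two distances are fused by a weighted sum; the paper's
    actual combination rule is not given in this summary.
    """
    rows = []
    for t_box, t_emb in zip(track_boxes, track_embs):
        row = []
        for d_box, d_emb in zip(det_boxes, det_embs):
            # GIoU distance 1 - GIoU lies in [0, 2); halve it to reach [0, 1).
            giou_dist = (1.0 - giou(t_box, d_box)) / 2.0
            emb_dist = cosine_distance(t_emb, d_emb)
            row.append(weight * emb_dist + (1.0 - weight) * giou_dist)
        rows.append(row)
    return rows


if __name__ == "__main__":
    costs = eg_matrix(
        track_boxes=[(0, 0, 1, 1)],
        track_embs=[(1.0, 0.0)],
        det_boxes=[(0, 0, 1, 1), (2, 0, 3, 1)],
        det_embs=[(1.0, 0.0), (0.0, 1.0)],
    )
    print(costs)
```

Unlike plain IoU, GIoU still varies when two boxes do not overlap, so a briefly lost or occluded target whose predicted box has drifted away from the detection can still receive a graded motion cost. In a tracker, a matrix like this would typically be passed to an assignment solver such as the Hungarian algorithm.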
Taxonomy
Topics: Video Surveillance and Tracking Methods · Infrared Target Detection Methodologies · Advanced Chemical Sensor Technologies
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
