Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking
Yunfei Zhang, Chao Liang, Jin Gao, Zhipeng Zhang, Weiming Hu, Stephen Maybank, Xue Zhou, Liang Li

TL;DR
This paper introduces TCBTrack, a real-time multi-object tracking method that leverages temporal correlation through cross-correlation learning, achieving state-of-the-art results by enhancing feature discriminability and motion understanding.
Contribution
The paper proposes a novel learning approach using cross-correlation to incorporate temporal information into a lightweight feature extractor, improving MOT performance and robustness.
Findings
Achieves state-of-the-art results on MOT17, MOT20, and DanceTrack datasets.
Balances speed, robustness, and accuracy effectively.
Outperforms existing online trackers in real-time multi-object tracking.
Abstract
Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) by embedding the Re-Identification (ReID) task into the detector, so that appearance features are extracted as an auxiliary task; this balances inference speed against tracking performance. However, resolving the competition between the detector and the feature extractor has always been a challenge. Meanwhile, the problem of directly embedding the ReID task into MOT remains unresolved: appearance features lack high discriminability, which limits their utility. In this paper, a new learning approach using cross-correlation to capture temporal information of objects is proposed. The feature extraction network is no longer trained solely on appearance features from each frame but learns richer motion features by utilizing feature…
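As a rough illustration of what cross-correlating features across frames can look like, the sketch below computes a cosine-similarity response between one object's embedding from frame t and every location of a feature map from frame t+1. This is not the authors' implementation: the function names, tensor shapes, and the use of cosine similarity are assumptions chosen for the example, and the random features stand in for the output of a real feature extractor.

```python
import numpy as np


def correlation_map(embedding, feature_map, eps=1e-8):
    """Cosine-similarity response of one object embedding over a feature map.

    embedding:   (C,) vector for an object in frame t (hypothetical input)
    feature_map: (C, H, W) features from frame t+1 (hypothetical input)
    returns:     (H, W) response map with values in [-1, 1]
    """
    # Normalise the embedding and every spatial feature vector to unit length.
    e = embedding / (np.linalg.norm(embedding) + eps)
    f = feature_map / (np.linalg.norm(feature_map, axis=0, keepdims=True) + eps)
    # Dot product over the channel axis at each spatial location.
    return np.einsum("c,chw->hw", e, f)


def peak_location(response):
    """Row and column index of the strongest response."""
    return np.unravel_index(np.argmax(response), response.shape)


# Toy data: random features in place of a real backbone (assumption).
rng = np.random.default_rng(0)
C, H, W = 16, 8, 10
feature_map = rng.normal(size=(C, H, W))

# Pretend the object seen in frame t now sits at row 3, column 7 of frame t+1.
embedding = feature_map[:, 3, 7].copy()

response = correlation_map(embedding, feature_map)
row, col = peak_location(response)
```

In this toy setup the response peaks where the embedding was taken from, which is the kind of signal a tracker could use as a motion cue alongside appearance matching.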
Taxonomy
Topics: Target Tracking and Data Fusion in Sensor Networks · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
