ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box
Yifu Zhang, Xinggang Wang, Xiaoqing Ye, Wei Zhang, Jincheng Lu, Xiao Tan, Errui Ding, Peize Sun, Jingdong Wang

TL;DR
ByteTrackV2 introduces a hierarchical data association method for improved multi-object tracking in 2D and 3D videos, effectively handling detection score fluctuations and incorporating velocity predictions, leading to state-of-the-art results.
Contribution
It proposes a generic data association strategy that enhances multi-object tracking by mining true objects from low-score detections and integrates velocity-based motion prediction for 3D scenarios.
Findings
Achieves top performance on the nuScenes 3D MOT leaderboard.
Effectively handles detection score fluctuations and object disappearance.
Compatible with various detectors and applicable in real-world scenarios.
Abstract
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects across video frames. Detection boxes serve as the basis of both 2D and 3D MOT. The inevitable changing of detection scores leads to object missing after tracking. We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes, which alleviates the problems of object missing and fragmented trajectories. The simple and generic data association strategy shows effectiveness under both 2D and 3D settings. In 3D scenarios, it is much easier for the tracker to predict object velocities in the world coordinate. We propose a complementary motion prediction strategy that incorporates the detected velocities with a Kalman filter to address the problem of abrupt motion and short-term disappearing. ByteTrackV2 leads the nuScenes 3D MOT leaderboard in both camera (56.4%…
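The hierarchical association described in the abstract can be illustrated with a minimal sketch: tracks are first matched against high-score detections, and only the tracks left unmatched are then compared with low-score detections. Everything specific here is an assumption made for illustration and is not taken from the paper: the thresholds (`high_thresh`, `low_thresh`, `min_iou`), the use of 2D IoU as the similarity, the function names, and the greedy matcher, which stands in for whatever assignment solver a real tracker would use.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: Dict[int, Box], det_ids: List[int],
                 boxes: List[Box], min_iou: float) -> Dict[int, int]:
    """Pair track ids with detection indices in order of descending IoU.

    Greedy matching is a simplification chosen to keep the sketch
    dependency-free; it is not claimed to be the paper's solver.
    """
    pairs = sorted(
        ((iou(tb, boxes[d]), tid, d) for tid, tb in tracks.items() for d in det_ids),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, d in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same object
        if tid in matches or d in used:
            continue
        matches[tid] = d
        used.add(d)
    return matches


def associate(tracks: Dict[int, Box], detections: List[Tuple[Box, float]],
              high_thresh: float = 0.6, low_thresh: float = 0.1,
              min_iou: float = 0.3):
    """Two-stage association; all threshold values are assumed."""
    boxes = [box for box, _ in detections]
    high = [i for i, (_, s) in enumerate(detections) if s >= high_thresh]
    low = [i for i, (_, s) in enumerate(detections) if low_thresh <= s < high_thresh]

    # Stage 1: match every track against high-score detections.
    first = greedy_match(tracks, high, boxes, min_iou)

    # Stage 2: tracks still unmatched get a chance on low-score detections.
    remaining = {tid: b for tid, b in tracks.items() if tid not in first}
    second = greedy_match(remaining, low, boxes, min_iou)

    matches = {**first, **second}
    # Unmatched high-score detections are candidates for new tracks.
    unmatched_high = [i for i in high if i not in first.values()]
    return matches, unmatched_high


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [
    ((1, 1, 11, 11), 0.9),         # confident detection of track 1
    ((20, 21, 30, 31), 0.3),       # low-score detection near track 2
    ((50, 50, 60, 60), 0.8),       # confident detection of a new object
    ((100, 100, 110, 110), 0.2),   # low-score box with no nearby track
]
matches, unmatched_high = associate(tracks, detections)
```

In this toy example, track 2 is recovered through the low-score detection in the second stage, while the low-score box with no nearby track is simply ignored rather than starting a new track.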
Taxonomy
Topics: Video Surveillance and Tracking Methods · Infrared Target Detection Methodologies · Advanced Image and Video Retrieval Techniques
