Learning Association via Track-Detection Matching for Multi-Object Tracking
Momir Adžemović

TL;DR
This paper introduces TDLP, a data-driven, modular, and efficient link prediction method for multi-object tracking that outperforms existing heuristics and end-to-end approaches across multiple benchmarks.
Contribution
The paper presents a novel link prediction approach for data association in multi-object tracking, combining efficiency with learned association without handcrafted rules.
Findings
TDLP surpasses state-of-the-art methods on multiple benchmarks.
Link prediction outperforms metric learning for heterogeneous features.
The approach is modular and computationally efficient.
Abstract
Multi-object tracking aims to maintain object identities over time by associating detections across video frames. Two dominant paradigms exist in the literature: tracking-by-detection methods, which are computationally efficient but rely on handcrafted association heuristics, and end-to-end approaches, which learn association from data at the cost of higher computational complexity. We propose Track-Detection Link Prediction (TDLP), a tracking-by-detection method that performs per-frame association via link prediction between tracks and detections, i.e., by predicting the correct continuation of each track at every frame. TDLP is architecturally designed primarily for geometric features such as bounding boxes, while optionally incorporating additional cues, including pose and appearance. Unlike heuristic-based methods, TDLP learns association directly from data without handcrafted rules,…
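To make the per-frame association step concrete, the following is a minimal sketch of how track-detection link scores could be turned into assignments. It is not the paper's implementation: the abstract does not say how predicted links are resolved into a one-to-one matching, so the greedy strategy, the function name `associate`, the `link_scores` matrix (a stand-in for a learned model's output), and the `threshold` value are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def associate(
    link_scores: List[List[float]],
    threshold: float = 0.5,
) -> Tuple[Dict[int, int], List[int], List[int]]:
    """Greedy one-to-one matching of tracks to detections for a single frame.

    link_scores[t][d] is a hypothetical predicted probability that detection d
    continues track t. Greedy resolution and the threshold are assumptions,
    not details taken from the paper.
    """
    n_tracks = len(link_scores)
    n_dets = len(link_scores[0]) if n_tracks else 0

    # Collect every candidate link that clears the threshold, best first.
    candidates = sorted(
        (
            (link_scores[t][d], t, d)
            for t in range(n_tracks)
            for d in range(n_dets)
            if link_scores[t][d] >= threshold
        ),
        reverse=True,
    )

    matches: Dict[int, int] = {}
    used_dets = set()
    for _score, t, d in candidates:
        # Each track and each detection may appear in at most one link.
        if t in matches or d in used_dets:
            continue
        matches[t] = d
        used_dets.add(d)

    unmatched_tracks = [t for t in range(n_tracks) if t not in matches]
    unmatched_dets = [d for d in range(n_dets) if d not in used_dets]
    return matches, unmatched_tracks, unmatched_dets


# Two existing tracks, three detections in the current frame.
scores = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.1],
]
matches, lost_tracks, new_dets = associate(scores)
```

In this sketch, unmatched detections would typically seed new tracks and unmatched tracks would be kept or retired by a separate track-management policy. Greedy matching is chosen here only for brevity; an optimal assignment solver would be a drop-in alternative.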
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Gaze Tracking and Assistive Technology
