Collaborative Tracking Learning for Frame-Rate-Insensitive Multi-Object Tracking
Yiheng Liu, Junta Wu, Yi Fu

TL;DR
This paper introduces ColTrack, a collaborative learning approach for multi-object tracking that maintains high performance at low frame rates, reducing computational costs and improving efficiency on edge devices.
Contribution
The paper proposes a novel end-to-end query-based method with historical queries, an information refinement module, and a tracking object consistency loss for frame-rate-insensitive MOT.
Findings
Outperforms state-of-the-art methods on the DanceTrack and BDD100K datasets.
Achieves higher accuracy than existing end-to-end methods on MOT17.
Maintains high performance at reduced frame rates, enabling faster processing.
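To make the frame-rate point concrete, here is a minimal sketch of how a low-frame-rate video might be simulated by temporal subsampling. It is not taken from the paper: the helper names `subsample_frames` and `effective_fps`, the 30 fps source rate, and the stride of 5 are all illustrative assumptions.

```python
def subsample_frames(frame_indices, stride):
    """Keep every `stride`-th frame index to simulate a lower frame rate.

    Hypothetical helper for illustration; not part of ColTrack.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return list(frame_indices)[::stride]


def effective_fps(source_fps, stride):
    """Frame rate that remains after keeping one frame out of every `stride`."""
    return source_fps / stride


# One second of an assumed 30 fps clip, sampled with an assumed stride of 5.
frames = range(30)
low_rate = subsample_frames(frames, stride=5)
print(low_rate)              # indices of the frames a tracker would process
print(effective_fps(30, 5))  # remaining frame rate
```

With a stride of 5 the tracker processes one fifth of the frames, which is where the compute saving comes from. The gap between consecutive processed frames also grows fivefold, which is the source of the larger location and appearance changes that the abstract identifies as the difficulty.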
Abstract
Multi-object tracking (MOT) at low frame rates can reduce computational, storage and power overhead to better meet the constraints of edge devices. Many existing MOT methods suffer from significant performance degradation in low-frame-rate videos due to significant location and appearance changes between adjacent frames. To this end, we propose to explore collaborative tracking learning (ColTrack) for frame-rate-insensitive MOT in a query-based end-to-end manner. Multiple historical queries of the same target jointly track it with richer temporal descriptions. Meanwhile, we insert an information refinement module between every two temporal blocking decoders to better fuse temporal clues and refine features. Moreover, a tracking object consistency loss is proposed to guide the interaction between historical queries. Extensive experimental results demonstrate that in high-frame-rate…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Image Enhancement Techniques · Image and Video Quality Assessment
