Tracking without Label: Unsupervised Multiple Object Tracking via Contrastive Similarity Learning
Sha Meng, Dian Shao, Jiacheng Guo, Shan Gao

TL;DR
This paper introduces UCSL, an unsupervised contrastive learning approach for multiple object tracking that leverages feature consistency across frames to improve accuracy without relying on labels.
Contribution
It proposes an unsupervised contrastive framework with three contrast modules (self-contrast, cross-contrast, and ambiguity contrast) that improve object representation and association in MOT without label supervision.
Findings
Outperforms existing unsupervised MOT methods.
Achieves higher accuracy than some fully supervised methods.
Effectively mitigates occlusion and ambiguity issues.
Abstract
Unsupervised learning is a challenging task due to the lack of labels. Multiple Object Tracking (MOT), which inevitably suffers from mutual object interference, occlusion, etc., is even more difficult without label supervision. In this paper, we explore the latent consistency of sample features across video frames and propose an Unsupervised Contrastive Similarity Learning method, named UCSL, including three contrast modules: self-contrast, cross-contrast, and ambiguity contrast. Specifically, i) self-contrast uses intra-frame direct and inter-frame indirect contrast to obtain discriminative representations by maximizing self-similarity; ii) cross-contrast aligns cross- and continuous-frame matching results, mitigating the persistent negative effect caused by object occlusion; and iii) ambiguity contrast matches ambiguous objects with each other to further increase the certainty of…
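The self-contrast idea in the abstract (intra-frame direct and inter-frame indirect contrast that maximizes self-similarity) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the use of cosine similarity, the softmax temperature, and the exact form of the loss are all hypothetical. The sketch treats the direct term as "each object in a frame should be most similar to itself" and the indirect term as "matching an object from frame t to frame t+1 and back should return to the same object".

```python
import math


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    def norm(v):
        # Fall back to 1.0 for an all-zero vector to avoid division by zero.
        return math.sqrt(sum(x * x for x in v)) or 1.0
    return [[sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v)) for v in b]
            for u in a]


def row_softmax(m, temperature=0.1):
    """Turn each row of similarities into a soft assignment distribution."""
    out = []
    for row in m:
        mx = max(row)  # subtract the row max for numerical stability
        exps = [math.exp((x - mx) / temperature) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def self_contrast_loss(feats_t, feats_t1, temperature=0.1):
    """Hypothetical self-contrast loss (illustrative, not the paper's formula).

    direct:   intra-frame assignment of frame t to itself should be the identity.
    indirect: the round trip t -> t+1 -> t should also be the identity.
    """
    intra = row_softmax(cosine_similarity_matrix(feats_t, feats_t), temperature)
    forward = row_softmax(cosine_similarity_matrix(feats_t, feats_t1), temperature)
    backward = row_softmax(cosine_similarity_matrix(feats_t1, feats_t), temperature)
    cycle = matmul(forward, backward)

    n = len(feats_t)
    eps = 1e-12  # guards against log(0)
    direct = -sum(math.log(intra[i][i] + eps) for i in range(n)) / n
    indirect = -sum(math.log(cycle[i][i] + eps) for i in range(n)) / n
    return direct + indirect


# Two clearly different objects that move slightly between frames.
distinct = self_contrast_loss([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]])
# Two objects with identical embeddings: every match is a coin flip.
ambiguous = self_contrast_loss([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
print(distinct, ambiguous)
```

Under these assumptions, the loss is close to zero when object embeddings are discriminative and rises to 2 ln 2 when two objects are indistinguishable, which is the kind of ambiguity the abstract says the method aims to reduce.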
Taxonomy
Topics: Video Surveillance and Tracking Methods · Image Enhancement Techniques · Air Quality Monitoring and Forecasting
