Semi-TCL: Semi-Supervised Track Contrastive Representation Learning
Wei Li, Yuanjun Xiong, Shuo Yang, Mingze Xu, Yongxin Wang, Wei Xia

TL;DR
Semi-TCL introduces a semi-supervised contrastive learning approach for multi-object tracking that leverages both labeled and unlabeled videos to learn better appearance embeddings, leading to better tracking performance.
Contribution
It proposes a novel instance-to-track matching objective within a contrastive learning framework, enabling semi-supervised learning for appearance embeddings in multi-object tracking.
Findings
Outperforms state-of-the-art methods on multiple tracking benchmarks.
Effectively learns discriminative appearance embeddings from unlabeled data.
Demonstrates robustness in semi-supervised learning scenarios.
Abstract
Online tracking of multiple objects in videos requires a strong capacity for modeling and matching object appearances. Previous methods for learning appearance embeddings mostly rely on instance-level matching without considering the temporal continuity provided by videos. We design a new instance-to-track matching objective to learn appearance embeddings that compares a candidate detection to the embeddings of the tracks persisted in the tracker. It enables us to learn not only from videos labeled with complete tracks, but also from unlabeled or partially labeled videos. We implement this learning objective in a unified form following the spirit of contrastive loss. Experiments on multiple object tracking datasets demonstrate that our method can effectively learn discriminative appearance embeddings in a semi-supervised fashion and outperform state-of-the-art methods on representative…
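The instance-to-track objective described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; it is one plausible reading under stated assumptions: each track embedding is taken to be the mean of its L2-normalized instance embeddings, similarity is cosine similarity, and the loss is a softmax cross-entropy (InfoNCE-style) over tracks with a hypothetical `temperature` parameter. The function names `track_embeddings` and `instance_to_track_loss` are illustrative only.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to (approximately) unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def track_embeddings(instance_embs, track_ids, num_tracks):
    """Assumed track representation: mean of normalized instance embeddings.

    Every track id in range(num_tracks) must own at least one instance.
    """
    embs = l2_normalize(instance_embs)
    tracks = np.zeros((num_tracks, embs.shape[1]))
    for t in range(num_tracks):
        tracks[t] = embs[track_ids == t].mean(axis=0)
    return l2_normalize(tracks)


def instance_to_track_loss(det_embs, det_track_ids, tracks, temperature=0.1):
    """Contrastive loss comparing each detection against all track embeddings.

    The track a detection belongs to is the positive; all other tracks
    act as negatives. `temperature` is an assumed hyperparameter.
    """
    logits = l2_normalize(det_embs) @ tracks.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(det_embs)), det_track_ids].mean()


# Toy usage: four detections forming two well-separated tracks.
instances = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
ids = np.array([0, 0, 1, 1])
tracks = track_embeddings(instances, ids, num_tracks=2)
loss = instance_to_track_loss(instances, ids, tracks)
```

In this sketch, `ids` would come from ground-truth annotations for labeled videos. For unlabeled or partially labeled videos, one could assume the ids are the associations produced by the tracker itself, which is how a single loss form might cover both cases; the source abstract does not spell out that mechanism.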
Taxonomy
Topics: Face recognition and analysis · Video Surveillance and Tracking Methods · Human Pose and Action Recognition
