Collecting Consistently High Quality Object Tracks with Minimal Human Involvement by Using Self-Supervised Learning to Detect Tracker Errors
Samreen Anjum, Suyog Jain, and Danna Gurari

TL;DR
This paper introduces a hybrid framework that uses self-supervised learning to detect when an object tracker fails and then brings in a human to re-localize the object, producing high-quality tracks with minimal human intervention. Because it needs no labeled data, it can be applied to new object categories.
Contribution
It presents a self-supervised approach that tailors a tracker error detector to each dataset and triggers human re-localization when the tracker fails. No labeled data is required.
Findings
Outperforms existing methods on three datasets.
Effective for small, fast-moving, or occluded objects.
Requires minimal human input for high-quality tracking.
Abstract
We propose a hybrid framework for consistently producing high-quality object tracks by combining an automated object tracker with a small amount of human input. The key idea is to tailor a module to each dataset that decides when the object tracker is failing, so that humans can be brought in to re-localize the object for continued tracking. Our approach leverages self-supervised learning on unlabeled videos to learn a tailored representation of a target object, which is then used to actively monitor the tracked region and decide when the tracker fails. Since labeled data is not needed, our approach can be applied to novel object categories. Experiments on three datasets demonstrate that our method outperforms existing approaches, especially for small, fast-moving, or occluded objects.
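The monitoring loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the callables `tracker_step`, `embed`, and `ask_human` are hypothetical placeholders, and the failure test (cosine similarity to the first-frame embedding compared against a fixed `threshold`) is a simple stand-in for the learned, dataset-specific module the paper describes.

```python
import math
from typing import Any, Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height); an assumed convention
Embedding = Sequence[float]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def track_with_monitoring(
    frames: Sequence[Any],
    init_box: Box,
    tracker_step: Callable[[Any, Box], Box],   # hypothetical automated tracker
    embed: Callable[[Any, Box], Embedding],    # hypothetical learned representation
    ask_human: Callable[[Any], Box],           # hypothetical human re-localization
    threshold: float = 0.5,                    # assumed fixed cutoff, for illustration
) -> Tuple[List[Box], int]:
    """Track an object, asking a human for help when the monitor flags a failure."""
    reference = embed(frames[0], init_box)     # appearance of the target at the start
    boxes = [init_box]
    human_calls = 0
    box = init_box
    for frame in frames[1:]:
        box = tracker_step(frame, box)         # automated tracker proposes a box
        similarity = cosine_similarity(reference, embed(frame, box))
        if similarity < threshold:             # monitor decides the tracker has failed
            box = ask_human(frame)             # human re-localizes the object
            human_calls += 1
        boxes.append(box)
    return boxes, human_calls
```

The returned `human_calls` count makes the trade-off explicit: a higher `threshold` flags more frames for human review, while a lower one relies more on the automated tracker.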
Taxonomy
Topics: Video Surveillance and Tracking Methods · Face and Expression Recognition · Industrial Vision Systems and Defect Detection
