Integration of Regularized l1 Tracking and Instance Segmentation for Video Object Tracking
Filiz Gurkan, Bilge Gunsel

TL;DR
This paper presents a novel video object tracking method combining deep detection, sparse dictionary representation, and a new state model to improve robustness against occlusion, pose, and scale variations, outperforming current state-of-the-art trackers.
Contribution
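The paper introduces an integrated tracking-by-detection framework that couples a deep object detector with a particle filter tracker, represents the tracked object with a regularized sparse dictionary, and adds a new state vector (translation, rotation, scaling, and shearing) so that deformed bounding boxes can be tracked.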
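As a rough illustration of what representing an object with a sparse dictionary means, the sketch below solves a generic l1-regularized least-squares problem with ISTA (iterative soft thresholding). It is not the paper's formulation: the solver choice, the function names `soft_threshold` and `ista_sparse_code`, the weight `lam`, and the random dictionary are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_sparse_code(D, y, lam=0.1, n_iter=200):
    """Approximately minimize 0.5 * ||y - D x||^2 + lam * ||x||_1 over x.

    D: (n_features, n_atoms) dictionary with one template per column.
    y: (n_features,) observation, e.g. a vectorized candidate patch.
    """
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    L = np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Toy data (hypothetical): a candidate that is a scaled copy of atom 3.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 20))
D /= np.linalg.norm(D, axis=0)      # unit-norm atoms
y = 0.9 * D[:, 3]
x = ista_sparse_code(D, y)
residual = np.linalg.norm(y - D @ x)
```

In l1 trackers of this general kind, the reconstruction residual of each candidate is typically turned into an observation likelihood; how the paper regularizes and updates its dictionary is not shown here.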
Findings
Achieves 11% and 9% improvements in success rate over state-of-the-art trackers on the VOT2016 and VOT2018 benchmark data sets.
Demonstrates robustness to occlusion, pose changes, and scale variations.
Outperforms existing trackers on challenging benchmarks.
Abstract
We introduce a tracking-by-detection method that integrates a deep object detector with a particle filter tracker under a regularization framework in which the tracked object is represented by a sparse dictionary. A novel observation model that establishes consensus between the detector and the tracker is formulated, enabling us to update the dictionary with the guidance of the deep detector. This yields an efficient representation of the object appearance throughout the video sequence and hence improves robustness to occlusion and pose changes. Moreover, we propose a new state vector consisting of translation, rotation, scaling, and shearing parameters that allows tracking of deformed object bounding boxes and hence significantly increases robustness to scale changes. Numerical results reported on challenging VOT2016 and VOT2018 benchmarking data sets demonstrate that the introduced tracker,…
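To make the state vector described in the abstract more concrete, the following minimal sketch maps a six-parameter state to the corners of a deformed bounding box and perturbs a set of particles with a Gaussian random walk. The parameter order `(tx, ty, theta, sx, sy, shear)`, the composition rotation · shear · scale, and the helper names are assumptions chosen for illustration; the paper's exact parameterization and motion model may differ.

```python
import numpy as np

def affine_from_state(state):
    """Build a 2x3 affine matrix from (tx, ty, theta, sx, sy, shear).

    The ordering and the composition R @ Sh @ S are assumptions of this sketch.
    """
    tx, ty, theta, sx, sy, shear = state
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])             # rotation
    Sh = np.array([[1.0, shear], [0.0, 1.0]])   # horizontal shear
    S = np.diag([sx, sy])                       # anisotropic scaling
    return np.column_stack((R @ Sh @ S, np.array([tx, ty])))

def box_corners(state, template_size=(1.0, 1.0)):
    """Map the corners of a centered template box into image coordinates."""
    w, h = template_size
    corners = np.array([[-w / 2, -h / 2], [w / 2, -h / 2],
                        [w / 2, h / 2], [-w / 2, h / 2]])
    M = affine_from_state(state)
    return corners @ M[:, :2].T + M[:, 2]

def propagate(particles, sigma, rng):
    """Random-walk motion model: add per-parameter Gaussian noise."""
    return particles + rng.normal(0.0, sigma, size=particles.shape)

# Hypothetical usage: a 2x4 axis-aligned box centered at (10, 20).
state = np.array([10.0, 20.0, 0.0, 2.0, 4.0, 0.0])
corners = box_corners(state)

rng = np.random.default_rng(0)
sigma = np.array([2.0, 2.0, 0.02, 0.05, 0.05, 0.01])
particles = propagate(np.tile(state, (100, 1)), sigma, rng)
```

Each row of `particles` is one candidate box; in a particle filter, every candidate would then be scored by an observation model and the set resampled according to those scores.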
