Learning Dynamic Compact Memory Embedding for Deformable Visual Object Tracking
Pengfei Zhu, Hongtao Yu, Kaihua Zhang, Yu Wang, Shuai Zhao, Lei Wang, Tianzhu Zhang, Qinghua Hu

TL;DR
This paper introduces a dynamic memory embedding approach to improve deformable visual object tracking by enhancing discrimination and segmentation accuracy, especially under severe deformation and distractors.
Contribution
It proposes a novel dynamic compact memory embedding and point-to-global matching strategy for segmentation-based deformable object tracking, addressing limitations of existing methods.
Findings
Outperforms recent trackers on six challenging benchmarks.
Achieves superior segmentation accuracy over D3S and SiamMask.
Effectively handles severe deformable variations and distractors.
Abstract
Recently, template-based trackers have become the leading tracking algorithms with promising performance in terms of efficiency and accuracy. However, the correlation operation between the query feature and the given template only exploits accurate target localization, leading to state estimation error, especially when the target suffers from severe deformable variations. To address this issue, segmentation-based trackers have been proposed that employ per-pixel matching to improve the tracking performance of deformable objects effectively. However, most existing trackers only refer to the target features in the initial frame, thereby lacking the discriminative capacity to handle challenging factors, e.g., similar distractors, background clutter, appearance change, etc. To this end, we propose a dynamic compact memory embedding to enhance the discrimination of the segmentation-based…
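The abstract contrasts matching against initial-frame target features only with matching against a memory that changes over time. The toy sketch below illustrates that general idea and is not the paper's method: the function names (`per_pixel_match`, `update_memory`), the cosine-similarity score, the novelty threshold, and the eviction rule are all assumptions made for illustration, since the text above does not describe the actual embedding or update rule.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def per_pixel_match(query_pixels, memory):
    """Score each query pixel by its best similarity to any stored target feature."""
    return [
        max((cosine(q, m) for m in memory), default=-1.0)
        for q in query_pixels
    ]


def update_memory(memory, candidates, capacity, novelty_threshold=0.9):
    """Hypothetical compact update (an assumption, not the paper's rule).

    A candidate is stored only if no existing entry is already similar to it.
    When the memory exceeds `capacity`, the first entry (standing in for the
    initial frame) is kept together with the most recent entries.
    """
    updated = list(memory)
    for feat in candidates:
        best = max((cosine(feat, m) for m in updated), default=-1.0)
        if best < novelty_threshold:
            updated.append(feat)
    if len(updated) > capacity:
        keep_recent = capacity - 1
        updated = updated[:1] + updated[len(updated) - keep_recent:]
    return updated


# Memory holding only an initial-frame feature.
memory = [[1.0, 0.0]]
# Two query pixels: one resembles the initial appearance, one a changed appearance.
query = [[2.0, 0.0], [0.0, 3.0]]

scores_initial_only = per_pixel_match(query, memory)

# After the memory absorbs a feature of the changed appearance,
# the second pixel is also matched.
memory = update_memory(memory, [[0.0, 1.0]], capacity=2)
scores_with_memory = per_pixel_match(query, memory)
```

With only the initial-frame feature, the second query pixel receives a low score; once the memory holds a feature of the changed appearance, both pixels match. This is the intuition behind referring to more than the first frame, under the simplifying assumptions stated above.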
Taxonomy
Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
