Efficient Multi-level Correlating for Visual Tracking
Yipeng Ma, Chun Yuan, Peng Gao, Fei Wang

TL;DR
This paper introduces MLCFT, a multi-level correlation filter tracking method that combines deep and shallow CNN features with a two-stage detection scheme, achieving high accuracy and real-time speed.
Contribution
The paper proposes a novel multi-level CF tracking approach with a cascaded detection scheme and entropy-based feature fusion, improving speed and accuracy over existing methods.
Findings
Outperforms state-of-the-art trackers on benchmarks.
Achieves tracking speed exceeding 16 fps.
Effectively prevents model drift with two-stage detection.
Abstract
Correlation filter (CF) based tracking algorithms have demonstrated favorable performance recently. Nevertheless, the top-performing trackers always employ complicated optimization methods, which constrain their real-time applications. How to accelerate the tracking speed while retaining the tracking accuracy is a significant issue. In this paper, we propose a multi-level CF-based tracking approach named MLCFT which further explores the potential capacity of CF with two-stage detection: primal detection and oriented re-detection. The cascaded detection scheme is simple but competent to prevent model drift and accelerate the speed. An effective fusion method based on relative entropy is introduced to combine the complementary features extracted from deep and shallow layers of convolutional neural networks (CNN). Moreover, a novel online model update strategy is utilized in our tracker,…
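The abstract names relative entropy (Kullback-Leibler divergence) as the basis for fusing deep and shallow features but does not give the formulation here. The following is a minimal sketch under one common assumption: each response map is normalised into a probability distribution P_k, and the fused map Q is chosen to minimise the sum of KL(P_k || Q), whose closed-form solution is the arithmetic mean of the P_k. The function names, the normalisation, and the synthetic Gaussian response maps are all hypothetical and are not taken from the paper.

```python
import numpy as np

def to_distribution(response, eps=1e-12):
    """Shift a response map to be strictly positive and normalise it to sum to 1."""
    shifted = np.asarray(response, dtype=float) - np.min(response) + eps
    return shifted / shifted.sum()

def kl_divergence(p, q):
    """Relative entropy KL(p || q) for strictly positive distributions."""
    return float(np.sum(p * np.log(p / q)))

def fuse_responses(responses):
    """Assumed fusion rule: the Q minimising sum_k KL(P_k || Q) is the mean of the P_k."""
    dists = [to_distribution(r) for r in responses]
    return np.mean(dists, axis=0)

def total_divergence(dists, q):
    """Sum of KL(P_k || q) over all input distributions."""
    return sum(kl_divergence(p, q) for p in dists)

def gaussian_peak(shape, center, sigma):
    """Synthetic response map (illustration only): a 2-D Gaussian bump."""
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    return np.exp(-((ys - center[0]) ** 2 + (xs - center[1]) ** 2) / (2 * sigma ** 2))

# Hypothetical inputs: a broad "deep-layer" response and a sharp "shallow-layer" one.
deep = gaussian_peak((31, 31), (15, 16), sigma=4.0)
shallow = gaussian_peak((31, 31), (15, 15), sigma=1.5)

fused = fuse_responses([deep, shallow])
peak = np.unravel_index(np.argmax(fused), fused.shape)
print("estimated target position (row, col):", peak)
```

In this toy setup the sharper map dominates the location of the fused peak, while the broader map still contributes probability mass around it. The weighting actually used in MLCFT may differ from this uniform average.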
Taxonomy
Topics: Video Surveillance and Tracking Methods · Image Enhancement Techniques · Advanced Vision and Imaging
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
