A Strong Feature Representation for Siamese Network Tracker
Zhipeng Zhou, Rui Zhang, Dong Yin

TL;DR
This paper introduces SiamPF, a Siamese network tracker with a stronger feature representation using a modified VGG16 backbone, an added AlexNet-like branch, and a channel attention block, achieving high accuracy and real-time performance.
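The summary does not say how the channel attention block is built. As an illustration only, the sketch below shows a generic squeeze-and-excitation style gate, which is one common way to reweight channels. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy sizes are all assumptions made for this example and are not taken from the paper.

```python
import math


def channel_attention(features, w1, w2):
    """Rescale each channel of a C x H x W feature map by a gate in (0, 1).

    Generic squeeze-and-excitation style sketch (an assumption, not the
    paper's exact block). `w1` has shape (C_r x C), `w2` has shape (C x C_r).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in features]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gates, features)]
    return scaled, gates


# Toy input: two 2x2 channels with hand-picked (hypothetical) weights.
features = [[[1.0, 1.0], [1.0, 1.0]],
            [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[1.0, -1.0]]       # reduce 2 channels to 1 hidden unit
w2 = [[2.0], [-2.0]]     # expand back to 2 channel gates
scaled, gates = channel_attention(features, w1, w2)
print(gates)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the first channel is emphasized. In a trained network the weights would be learned rather than set by hand.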
Contribution
It proposes a novel Siamese tracker with an enhanced feature extraction method combining VGG16 and AlexNet-like features, improving accuracy over shallow models.
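For readers unfamiliar with Siamese trackers, the sketch below shows how a response map is typically produced: the template features are slid over the search-region features and a similarity score is computed at each offset. This is the generic cross-correlation step used by Siamese trackers in general, reduced to a single channel, and is not the SiamPF implementation. The helper names `cross_correlate` and `peak_location` are hypothetical.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` (2-D lists) and return the response map."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    return [[sum(search[i + di][j + dj] * template[di][dj]
                 for di in range(th) for dj in range(tw))
             for j in range(sw - tw + 1)]
            for i in range(sh - th + 1)]


def peak_location(response):
    """Return the (row, col) of the largest value in the response map."""
    _, row, col = max((v, i, j)
                      for i, r in enumerate(response)
                      for j, v in enumerate(r))
    return row, col


# Toy data: the template pattern is embedded in the search map at offset (1, 2).
template = [[1, 2],
            [3, 4]]
search = [[0, 0, 0, 0],
          [0, 0, 1, 2],
          [0, 0, 3, 4],
          [0, 0, 0, 0]]
response = cross_correlate(search, template)
print(peak_location(response))
```

The peak of the response map marks the offset where the search region best matches the template. A real tracker applies the same operation to multi-channel deep features rather than raw values.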
Findings
Achieved high accuracy on multiple benchmarks.
Maintained real-time tracking at 41 FPS.
Outperformed state-of-the-art algorithms.
Abstract
Object tracking has important applications in assistive technologies for personalized monitoring. Recent trackers that use AlexNet as their backbone for feature extraction have achieved great success. However, because AlexNet is too shallow to form a strong feature representation, trackers based on the Siamese network have an accuracy gap compared with state-of-the-art algorithms. To solve this problem, this paper proposes a tracker called SiamPF. First, a modified pre-trained VGG16 network is fine-tuned as the backbone. Second, an AlexNet-like branch is added after the third convolutional layer and merged with the response map of the backbone network to form a preliminary strong feature representation. Then, a channel attention block is designed to adaptively select the contributing features. Finally, the APCE is modified to process the response map to reduce interference and focus the…
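The abstract is cut off before it explains how the APCE is modified, so that modification is not reproduced here. As background, the sketch below computes the standard average peak-to-correlation energy of a response map: the squared gap between the maximum and minimum response, divided by the mean squared distance of all responses from the minimum. The function name `apce` and the two toy maps are assumptions for illustration.

```python
def apce(response):
    """Standard average peak-to-correlation energy of a 2-D response map.

    This is the unmodified measure; the paper's modified version is not
    described in the text above and is not implemented here.
    """
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    # Mean squared distance of every response value from the minimum.
    energy = sum((v - f_min) ** 2 for v in values) / len(values)
    if energy == 0.0:
        return 0.0  # completely flat map: there is no peak to measure
    return (f_max - f_min) ** 2 / energy


# A single sharp peak versus a peak surrounded by strong distractors.
sharp = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
noisy = [[0.0, 0.8, 0.0],
         [0.7, 1.0, 0.9],
         [0.0, 0.8, 0.0]]
print(apce(sharp), apce(noisy))
```

A higher value indicates a sharper, less ambiguous peak, which is why measures of this kind are used to judge how much interference a response map contains.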
Taxonomy
Topics: Video Surveillance and Tracking Methods · IoT-based Smart Home Systems · Fire Detection and Safety Systems
Methods: 1x1 Convolution · Convolution · Siamese Network · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax
