SiamVGG: Visual Tracking using Deeper Siamese Networks
Yuhong Li, Xiaofan Zhang, Deming Chen

TL;DR
SiamVGG is a Siamese-network visual tracker that pairs a VGG-16 backbone with a cross-correlation operator. It reaches high accuracy while running in real time at 50 FPS on a GTX 1080Ti, and it outperforms ECO and C-COT on VOT2017.
Contribution
The paper introduces SiamVGG, a novel deep Siamese network architecture based on VGG-16 that balances high accuracy with real-time tracking capabilities.
Findings
Achieves state-of-the-art accuracy on the OTB-2013/50/100 and VOT2015/2016/2017 benchmarks.
Runs at 50 FPS on GTX 1080Ti.
Outperforms ECO and C-COT on the VOT2017 challenge with 2% higher Expected Average Overlap (EAO).
Abstract
Recently, we have seen a rapid development of Deep Neural Network (DNN) based visual tracking solutions. Some trackers combine the DNN-based solutions with Discriminative Correlation Filters (DCF) to extract semantic features and successfully deliver state-of-the-art tracking accuracy. However, these solutions are highly compute-intensive and require long processing times, so real-time performance is not guaranteed. To deliver both high accuracy and reliable real-time performance, we propose a novel tracker called SiamVGG (code: https://github.com/leeyeehoo/SiamVGG). It combines a Convolutional Neural Network (CNN) backbone and a cross-correlation operator, and takes advantage of the features from exemplary images for more accurate object tracking. The architecture of SiamVGG is customized from VGG-16, with the parameters shared by both exemplary images and desired input video…
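The cross-correlation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cross_correlate` and `peak_location` are hypothetical, the feature maps are hand-written nested lists rather than outputs of the shared VGG-16 backbone, and the score is a plain unnormalized sum of products. It only shows the general idea of sliding exemplar features over search-region features and taking the highest response as the target position.

```python
# Hypothetical sketch of Siamese-style cross-correlation (not the SiamVGG code).
# Feature maps are nested lists indexed [channel][row][col]. In a real tracker
# both maps would come from the same CNN backbone with shared parameters.

def cross_correlate(exemplar, search):
    """Slide `exemplar` over `search` and return a 2-D score map."""
    channels = len(exemplar)
    eh, ew = len(exemplar[0]), len(exemplar[0][0])
    sh, sw = len(search[0]), len(search[0][0])
    if len(search) != channels or eh > sh or ew > sw:
        raise ValueError("feature maps are incompatible")
    scores = []
    for top in range(sh - eh + 1):
        row = []
        for left in range(sw - ew + 1):
            total = 0.0
            # Sum of element-wise products over all channels and positions.
            for c in range(channels):
                for i in range(eh):
                    for j in range(ew):
                        total += exemplar[c][i][j] * search[c][top + i][left + j]
            row.append(total)
        scores.append(row)
    return scores


def peak_location(scores):
    """Return (row, col) of the highest score; the first maximum wins ties."""
    best_pos, best_val = (0, 0), scores[0][0]
    for r, row in enumerate(scores):
        for c, value in enumerate(row):
            if value > best_val:
                best_pos, best_val = (r, c), value
    return best_pos
```

For example, correlating a one-channel 2x2 exemplar `[[[1, 2], [3, 4]]]` against a 4x4 search map that contains the same pattern with its top-left corner at row 1, column 2 yields a 3x3 score map whose peak is at `(1, 2)`. Because the score is unnormalized, high-magnitude regions can dominate; a trained backbone is what makes the response discriminative in practice.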
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Image Enhancement Techniques
Methods: ECO (Efficient Convolution Operators)
