# Learning Cascaded Siamese Networks for High Performance Visual Tracking

**Authors:** Peng Gao, Yipeng Ma, Ruyue Yuan, Liyi Xiao, Fei Wang

arXiv: 1905.02857 · 2019-05-09

## TL;DR

This paper introduces a cascaded Siamese network for visual tracking: an offline-trained matching subnetwork proposes candidate patches, and an online-updated classification subnetwork selects the final result among them. The authors report state-of-the-art performance on recent benchmarks.

## Contribution

The paper proposes a cascaded Siamese architecture in which a matching subnetwork and a classification subnetwork operate in sequence, together with an update method for the classification subnetwork that uses both similarity and classification scores.
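To make the two-stage idea concrete, here is a minimal sketch of a match-then-classify cascade. It is an illustration under stated assumptions, not the authors' implementation: features are single-channel 2D arrays, the similarity is a plain sliding-window cross-correlation, scale handling is omitted, and the names `similarity_map`, `top_k_positions`, `cascade_track`, and the `classify` callback are all hypothetical.

```python
import numpy as np


def similarity_map(exemplar_feat, search_feat):
    """Slide the exemplar over the search region (valid cross-correlation)."""
    eh, ew = exemplar_feat.shape
    sh, sw = search_feat.shape
    out = np.empty((sh - eh + 1, sw - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = search_feat[i:i + eh, j:j + ew]
            out[i, j] = np.sum(window * exemplar_feat)
    return out


def top_k_positions(score_map, k):
    """Return the (row, col) positions of the k highest similarity scores."""
    flat = np.argsort(score_map, axis=None)[::-1][:k]
    return [tuple(int(v) for v in np.unravel_index(idx, score_map.shape))
            for idx in flat]


def cascade_track(exemplar_feat, search_feat, classify, k=3):
    """Stage 1 proposes candidates by similarity; stage 2 picks one by class score.

    `classify` is a hypothetical callable mapping a candidate patch to a score
    (higher means more target-like). Returns (position, similarity, class score).
    """
    smap = similarity_map(exemplar_feat, search_feat)
    eh, ew = exemplar_feat.shape
    scored = []
    for (i, j) in top_k_positions(smap, k):
        patch = search_feat[i:i + eh, j:j + ew]
        scored.append(((i, j), float(smap[i, j]), float(classify(patch))))
    # The final decision rests on the classification score, not the similarity.
    return max(scored, key=lambda item: item[2])
```

In this sketch `classify` could be any scorer, for example `lambda p: -np.sum((p - exemplar) ** 2)`; in the paper that role is played by the online-updated classification subnetwork.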

## Key findings

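- Reports state-of-the-art performance on recent tracking benchmarks.
- Targets high-performance tracking in the adverse ("negative") scenarios that make tracking difficult.
- The classification subnetwork is updated online with stochastic gradient descent to learn target-specific information, while the matching subnetwork stays fixed.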

## Abstract

Visual tracking is one of the most challenging computer vision problems. To achieve high-performance visual tracking in various negative scenarios, a novel cascaded Siamese network is proposed, built from two different deep learning networks: a matching subnetwork and a classification subnetwork. The matching subnetwork is a fully convolutional Siamese network. Based on the similarity score between the exemplar image and the candidate image, it searches for possible object positions and crops scaled candidate patches. The classification subnetwork further evaluates the cropped candidate patches and determines the optimal tracking result based on the classification score. The matching subnetwork is trained offline and fixed online, while the classification subnetwork performs stochastic gradient descent online to learn more target-specific information. To improve tracking performance further, the classification subnetwork is updated with a method based on both similarity and classification scores. Extensive experimental results demonstrate that the proposed approach achieves state-of-the-art performance on recent benchmarks.
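The online update described in the abstract can be sketched as a gated stochastic gradient descent step. This summary does not state the exact update rule, so the sketch below makes loud assumptions: the classifier is a simple logistic regression standing in for the classification subnetwork, and an update is applied only when both the similarity score and the classification score reach hypothetical thresholds (`sim_thresh`, `cls_thresh`). All names here are illustrative.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class OnlineClassifier:
    """Logistic-regression stand-in for the classification subnetwork."""

    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr

    def score(self, x):
        """Probability that feature vector x belongs to the target."""
        return float(sigmoid(self.w @ x + self.b))

    def sgd_step(self, x, label):
        """One SGD step on the log loss for a single (x, label) sample."""
        err = self.score(x) - label  # gradient of log loss w.r.t. the logit
        self.w -= self.lr * err * x
        self.b -= self.lr * err


def maybe_update(clf, pos_feat, neg_feats, sim_score, cls_score,
                 sim_thresh=0.5, cls_thresh=0.5):
    """Update only when BOTH scores are high enough (assumed gating rule).

    Skipping low-confidence frames is meant to avoid learning from
    unreliable tracking results. Returns True if an update was applied.
    """
    if sim_score < sim_thresh or cls_score < cls_thresh:
        return False
    clf.sgd_step(pos_feat, 1.0)
    for x in neg_feats:
        clf.sgd_step(x, 0.0)
    return True
```

The thresholds and the choice of one positive plus several negative samples per frame are design choices of this sketch; the complete paper should be consulted for the rule the authors actually use.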

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.02857/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1905.02857/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1905.02857/full.md

---
Source: https://tomesphere.com/paper/1905.02857