Fully-Convolutional Siamese Networks for Object Tracking
Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr

TL;DR
This paper introduces a fully-convolutional Siamese network for object tracking, trained end-to-end offline on the ILSVRC15 dataset for object detection in video, that runs at frame-rates beyond real-time and achieves state-of-the-art results.
Contribution
It presents a novel Siamese network for object tracking, trained end-to-end, that is both fast and highly effective, improving over earlier methods that learn the object's appearance model exclusively online from the video itself.
Findings
Operates at frame-rates beyond real-time
Achieves state-of-the-art performance on multiple benchmarks
Simplifies the tracking process with a fully-convolutional architecture
Abstract
The problem of arbitrary object tracking has traditionally been tackled by learning a model of the object's appearance exclusively online, using as sole training data the video itself. Despite the success of these methods, their online-only approach inherently limits the richness of the model they can learn. Recently, several attempts have been made to exploit the expressive power of deep convolutional networks. However, when the object to track is not known beforehand, it is necessary to perform Stochastic Gradient Descent online to adapt the weights of the network, severely compromising the speed of the system. In this paper we equip a basic tracking algorithm with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video. Our tracker operates at frame-rates beyond real-time and, despite its extreme simplicity, achieves…
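As a rough illustration of the idea behind a fully-convolutional Siamese tracker, the sketch below slides an exemplar feature map over a larger search feature map and takes the peak of the resulting score map as the predicted location. It assumes both inputs are already the outputs of a shared embedding network, which is omitted here; the plain 2D lists, the names `cross_correlate` and `argmax_2d`, and the toy values are all hypothetical and not taken from the paper's code.

```python
# Hypothetical sketch: the shared embedding network is omitted, and the
# 2D lists below stand in for its single-channel output feature maps.

def cross_correlate(search, exemplar):
    """Slide `exemplar` over `search` (valid positions only); return the score map."""
    sh, sw = len(search), len(search[0])
    eh, ew = len(exemplar), len(exemplar[0])
    scores = []
    for i in range(sh - eh + 1):
        row = []
        for j in range(sw - ew + 1):
            total = 0.0
            # Inner product between the exemplar and the window at (i, j).
            for u in range(eh):
                for v in range(ew):
                    total += search[i + u][j + v] * exemplar[u][v]
            row.append(total)
        scores.append(row)
    return scores


def argmax_2d(scores):
    """Return (row, col) of the first maximum in the score map."""
    best_pos, best_val = (0, 0), scores[0][0]
    for i, row in enumerate(scores):
        for j, value in enumerate(row):
            if value > best_val:
                best_pos, best_val = (i, j), value
    return best_pos


# Toy data: the exemplar pattern is embedded in the search map at row 1, col 2.
exemplar = [[1, 2],
            [3, 4]]
search = [[0, 0, 0, 0],
          [0, 0, 1, 2],
          [0, 0, 3, 4],
          [0, 0, 0, 0]]

scores = cross_correlate(search, exemplar)
peak = argmax_2d(scores)
print(peak)
```

Because the same operation is applied at every offset, one pass over the search region scores all candidate positions at once, and no weights are updated while tracking. A real tracker would use multi-channel learned features rather than raw toy values.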
Taxonomy
Topics: Video Surveillance and Tracking Methods · Fire Detection and Safety Systems · Advanced Neural Network Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Siamese Network
