Fast Dynamic Convolutional Neural Networks for Visual Tracking
Zhiyan Cui, Na Lu

TL;DR
This paper introduces a fast visual tracking algorithm that combines the CNN-based MDNet with RoIAlign, significantly improving speed while maintaining high precision, with real-time applications in mind.
Contribution
The paper proposes a novel combination of RoIAlign with MDNet to accelerate CNN-based tracking without sacrificing accuracy.
Findings
Achieved 7x speedup over MDNet
Maintained high tracking precision at around 7 fps
Validated on OTB100 and VOT2016 benchmarks
Abstract
Most existing tracking methods based on CNNs (convolutional neural networks) are too slow for real-time applications, despite their excellent tracking precision compared with traditional methods. In this paper, a fast dynamic visual tracking algorithm combining the CNN-based MDNet (Multi-Domain Network) and RoIAlign was developed. The major problem of MDNet also lies in its time efficiency. Because the computational cost of MDNet comes mainly from the large number of convolution operations and from fine-tuning the network during tracking, a RoIPool layer, which allows the convolution to be computed once over the whole image instead of once per RoI, is added to accelerate the convolution, and a new strategy for fine-tuning the fully-connected layers is used to accelerate the update. With RoIPool employed, the computation speed increased, but the tracking precision dropped at the same time.…
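The trade-off the abstract describes (RoIPool is fast but loses precision, which motivates RoIAlign) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the floor/ceil quantization used for RoIPool, the average pooling, and the 2x2 sampling grid used for RoIAlign are all assumptions chosen for illustration. Both functions read from one shared feature map, which stands in for the single convolution pass over the whole image.

```python
import math

def bilinear(fmap, y, x):
    """Sample a 2-D feature map at a fractional (y, x) position."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)          # clamp to the map
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = math.floor(y), math.floor(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bot = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bot * dy

def roi_align(fmap, roi, out_size=2, samples=2):
    """RoIAlign-style pooling: keep fractional RoI bounds and average
    bilinear samples inside each bin. roi = (y1, x1, y2, x2) in
    feature-map coordinates."""
    y1, x1, y2, x2 = roi
    bin_h, bin_w = (y2 - y1) / out_size, (x2 - x1) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    yy = y1 + (i + (sy + 0.5) / samples) * bin_h
                    xx = x1 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear(fmap, yy, xx)
            row.append(total / (samples * samples))
        out.append(row)
    return out

def roi_pool(fmap, roi, out_size=2):
    """RoIPool-style pooling: snap the RoI to whole cells (floor/ceil),
    then max-pool each bin. Assumes the RoI lies inside the map."""
    y1, x1 = math.floor(roi[0]), math.floor(roi[1])
    y2, x2 = math.ceil(roi[2]), math.ceil(roi[3])
    h, w = y2 - y1, x2 - x1
    out = []
    for i in range(out_size):
        ys = y1 + (i * h) // out_size
        ye = y1 - ((-(i + 1) * h) // out_size)   # ceiling division
        row = []
        for j in range(out_size):
            xs = x1 + (j * w) // out_size
            xe = x1 - ((-(j + 1) * w) // out_size)
            row.append(max(fmap[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        out.append(row)
    return out

# One shared feature map (a horizontal ramp), two RoIs that differ by a
# quarter of a cell in x.
fmap = [[0.0, 1.0, 2.0, 3.0] for _ in range(4)]
roi_a = (0.0, 0.50, 2.0, 2.50)
roi_b = (0.0, 0.75, 2.0, 2.75)
print(roi_align(fmap, roi_a), roi_align(fmap, roi_b))
print(roi_pool(fmap, roi_a), roi_pool(fmap, roi_b))
```

In this toy setup the two RoIs produce different RoIAlign outputs but identical RoIPool outputs, because quantization snaps both boxes to the same cells. That loss of sub-cell localization is one plausible reading of why precision drops with RoIPool, under the assumptions stated above.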
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Infrared Target Detection Methodologies
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · RoIAlign · RoIPool · Convolution
