FPGA-based Acceleration System for Visual Tracking
Ke Song, Chun Yuan, Peng Gao, Yunxu Sun

TL;DR
This paper presents an FPGA-based hardware system that implements a real-time, DSST-based visual tracking algorithm and runs at more than 153 fps while limiting FPGA resource usage.
Contribution
The paper proposes a real-time tracking algorithm based on DSST (Discriminative Scale Space Tracking) and implements it on a Xilinx XC7K325T FPGA, aiming for higher tracking speed and lower power consumption than the PC-based implementations most trackers rely on.
Findings
Achieves over 153 frames per second in tracking.
Uses 33% of the FPGA's LUTs and 40% of its storage blocks, helped by batch processing in feature extraction and a time-division-multiplexed FFT IP core.
Demonstrates excellent performance in real-world tests.
Abstract
Visual tracking is one of the most important application areas of computer vision. At present, most tracking algorithms are implemented on PCs, where it is difficult to guarantee real-time performance in real-world scenarios. To improve tracking speed and reduce the overall power consumption of visual tracking, this paper proposes a real-time visual tracking algorithm based on the DSST (Discriminative Scale Space Tracking) approach. We implement a hardware system for the proposed algorithm on the Xilinx XC7K325T FPGA platform. The system runs at more than 153 frames per second. To reduce resource occupation, the feature extraction module uses batch processing, and the filter processing module time-division multiplexes the FFT IP core. Therefore, our hardware system utilizes LUTs and storage blocks of 33%…
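For readers unfamiliar with DSST, the filter processing the abstract refers to is built on a correlation filter that is trained and evaluated in the Fourier domain, which is why the FFT core matters so much for throughput. The following is a minimal single-channel software sketch of that idea in NumPy, not the authors' hardware design: the function names (`gaussian_target`, `train`, `update`, `detect`), the parameter values (`sigma`, `lam`, `eta`), and the use of raw pixels instead of extracted features are all assumptions made for illustration.

```python
import numpy as np


def gaussian_target(shape, sigma=2.0):
    """Desired response: a Gaussian peak at the centre of the patch."""
    h, w = shape
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.exp(-(yy ** 2 + xx ** 2) / (2.0 * sigma ** 2))


def train(patch, target):
    """Compute the filter numerator A and denominator B from one patch."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(target)
    A = np.conj(G) * F           # numerator, complex
    B = np.real(np.conj(F) * F)  # denominator, real power spectrum
    return A, B


def update(A, B, patch, target, eta=0.025):
    """Blend a new frame into the model with learning rate eta."""
    A_new, B_new = train(patch, target)
    return (1 - eta) * A + eta * A_new, (1 - eta) * B + eta * B_new


def detect(A, B, patch, lam=1e-2):
    """Return the (row, col) of the correlation response peak."""
    Z = np.fft.fft2(patch)
    response = np.real(np.fft.ifft2(np.conj(A) * Z / (B + lam)))
    row, col = np.unravel_index(np.argmax(response), response.shape)
    return int(row), int(col)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame0 = rng.standard_normal((32, 32))          # stand-in for a feature map
    A, B = train(frame0, gaussian_target(frame0.shape))
    frame1 = np.roll(frame0, (3, -5), axis=(0, 1))  # target moved by (3, -5)
    print(detect(A, B, frame1))
```

In this sketch every training and detection step reduces to forward and inverse 2D FFTs plus element-wise arithmetic, so a single FFT block can in principle be shared across those steps in sequence; the response peak moves away from the patch centre by the same offset as the target.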
Taxonomy
Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Advanced Vision and Imaging
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
