Parallel Tracking and Verifying: A Framework for Real-Time and High Accuracy Visual Tracking
Heng Fan, Haibin Ling

TL;DR
This paper introduces PTAV, a parallel tracking and verifying framework that runs a fast tracker and a verifier on separate threads to combine real-time speed with high accuracy, and reports improved results on standard tracking benchmarks.
Contribution
PTAV runs a tracker and a verifier asynchronously on separate threads: the tracker provides fast inference, and the verifier checks its results and corrects it when needed. The aim is higher accuracy than existing real-time trackers.
Findings
Achieves state-of-the-art accuracy among real-time trackers
Outperforms many deep learning-based solutions in benchmarks
Flexible framework with potential for further improvements
Abstract
Being intensively studied, visual tracking has seen great recent advances in either speed (e.g., with correlation filters) or accuracy (e.g., with deep features). Real-time and high accuracy tracking algorithms, however, remain scarce. In this paper we study the problem from a new perspective and present a novel parallel tracking and verifying (PTAV) framework, by taking advantage of the ubiquity of multi-thread techniques and borrowing from the success of parallel tracking and mapping in visual SLAM. Our PTAV framework typically consists of two components, a tracker T and a verifier V, working in parallel on two separate threads. The tracker T aims to provide a super real-time tracking inference and is expected to perform well most of the time; by contrast, the verifier V checks the tracking results and corrects T when needed. The key innovation is that V does not work on every frame…
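The two-thread arrangement described in the abstract can be illustrated with a small toy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function `run_ptav`, the `verify_every` interval, the queue-based hand-off, and the 1-D "position" frames are all hypothetical, and how a correction is applied is not specified in the excerpt above, so the sketch simply adopts the verifier's corrected estimate when it arrives.

```python
import queue
import threading


def run_ptav(frames, track, verify, verify_every=10):
    """Toy tracker/verifier loop: tracker on the calling thread, verifier on a worker."""
    requests = queue.Queue()     # tracker -> verifier: (index, frame, estimate)
    corrections = queue.Queue()  # verifier -> tracker: (index, corrected estimate)
    verified = []                # indices of frames the verifier checked

    def verifier_loop():
        while True:
            item = requests.get()
            if item is None:     # sentinel: the tracker has finished
                break
            index, frame, estimate = item
            ok, corrected = verify(frame, estimate)
            verified.append(index)
            if not ok:
                corrections.put((index, corrected))

    worker = threading.Thread(target=verifier_loop)
    worker.start()

    results = []
    state = None
    for index, frame in enumerate(frames):
        # Adopt any correction that has already arrived; never block on the verifier.
        try:
            while True:
                _, state = corrections.get_nowait()
        except queue.Empty:
            pass
        state = track(frame, state)
        results.append(state)
        # Assumption: only every verify_every-th frame is sent for verification.
        if index % verify_every == 0:
            requests.put((index, frame, state))

    requests.put(None)
    worker.join()
    return results, verified


# Hypothetical stand-ins: each frame is the target's true 1-D position.
def drifting_track(frame, state):
    # Starts at the true position, then under-estimates the motion by 0.1 per frame.
    return float(frame) if state is None else state + 0.9


def check(frame, estimate):
    # Accept the estimate if it is within 2 units of the truth; otherwise return the truth.
    return abs(estimate - frame) <= 2.0, float(frame)


results, verified = run_ptav(list(range(100)), drifting_track, check)
```

Because the verifier runs asynchronously, which frames pick up a correction depends on thread timing, so the per-frame estimates in `results` can differ between runs. The set of frames sent for verification is fixed by `verify_every`. A real system would also need to account for the delay between the frame that was verified and the frame the tracker is currently on.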
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Image Enhancement Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
