Evaluation of trackers for Pan-Tilt-Zoom Scenarios
Yucao Tang, Guillaume-Alexandre Bilodeau

TL;DR
This paper evaluates various tracking algorithms for Pan-Tilt-Zoom cameras using a virtual framework, highlighting the importance of speed and robustness, and exploring target prediction to improve long-term tracking performance.
Contribution
The study introduces a virtual PTZ framework for comprehensive evaluation of trackers and extends it with target prediction to enhance long-term tracking robustness.
Findings
Speed and robustness are crucial for PTZ tracking.
Target prediction can improve long-term tracking stability.
The evaluation framework effectively compares tracker performance in dynamic scenarios.
Abstract
Tracking with a Pan-Tilt-Zoom (PTZ) camera has been a research topic in computer vision for many years. Compared to tracking with a still camera, the images captured with a PTZ camera are highly dynamic in nature because the camera can perform large motions, resulting in quickly changing capture conditions. Furthermore, tracking with a PTZ camera involves camera control to position the camera on the target. For successful tracking and camera control, the tracker must either be fast enough or be able to accurately predict the next position of the target. Therefore, standard benchmarks do not allow a proper assessment of the quality of a tracker for the PTZ scenario. In this work, we use a virtual PTZ framework to evaluate different tracking algorithms and compare their performance. We also extend the framework to add target position prediction for the next frame, accounting for camera motion…
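As a rough illustration of what next-frame prediction that accounts for camera motion can look like, the following Python sketch assumes a constant-velocity target model expressed in absolute pan/tilt angles. It is not the authors' method, which the truncated abstract does not specify. The function names, the pinhole model with a focal length in pixels, the sign conventions (pixel offsets measured from the image centre, positive in the direction of positive pan/tilt), and the independent treatment of pan and tilt are all assumptions made for illustration.

```python
import math


def pixel_to_world_angles(px, py, pan, tilt, focal_px):
    """Convert a pixel offset from the image centre into absolute pan/tilt
    angles (radians), given the camera's current pan/tilt.

    Assumption: pan and tilt are treated independently, which is only a
    reasonable approximation for moderate tilt angles.
    """
    return pan + math.atan2(px, focal_px), tilt + math.atan2(py, focal_px)


def world_angles_to_pixel(world_pan, world_tilt, pan, tilt, focal_px):
    """Project absolute pan/tilt angles back to a pixel offset from the
    image centre for a camera oriented at (pan, tilt)."""
    return (focal_px * math.tan(world_pan - pan),
            focal_px * math.tan(world_tilt - tilt))


def predict_next_pixel(prev_obs, curr_obs, next_cam, focal_px):
    """Predict where the target will appear in the next frame.

    prev_obs, curr_obs: (px, py, pan, tilt) for the two latest frames.
    next_cam: (pan, tilt) the camera is expected to have at the next frame.
    """
    prev_w = pixel_to_world_angles(*prev_obs, focal_px)
    curr_w = pixel_to_world_angles(*curr_obs, focal_px)
    # Constant-velocity extrapolation in camera-independent angles,
    # so the camera's own motion does not pollute the target velocity.
    pred_w = (2 * curr_w[0] - prev_w[0], 2 * curr_w[1] - prev_w[1])
    return world_angles_to_pixel(*pred_w, *next_cam, focal_px)
```

Working in absolute angles separates target motion from camera motion: a stationary target keeps the same absolute angles even while the camera pans, so the predicted pixel position shifts only because the camera moves.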
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Image and Video Quality Assessment
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
