Visual Object Tracking on Multi-modal RGB-D Videos: A Review
Xue-Feng Zhu, Tianyang Xu, Xiao-Jun Wu

TL;DR
This review paper summarizes the state of RGB-D visual object tracking, including datasets, methods, performance metrics, and future research directions, highlighting the advantages of multi-modal data over traditional RGB tracking.
Contribution
It provides a comprehensive overview of RGB-D tracking research, including benchmarking datasets, evaluation metrics, existing methods, and future challenges, which was lacking in prior reviews.
Findings
Summarized key RGB-D tracking datasets and performance benchmarks.
Reviewed existing RGB-D tracking algorithms and their effectiveness.
Discussed future research directions and challenges in RGB-D tracking.
Abstract
Visual object tracking has been under development for decades. In recent years, with the wide availability of low-cost RGB-D sensors, visual object tracking on RGB-D videos has drawn much attention. Compared to conventional RGB-only tracking, RGB-D videos provide additional information that facilitates object tracking in some complicated scenarios. The goal of this review is to summarize the relevant knowledge in the research field of RGB-D tracking. Specifically, we survey the related RGB-D tracking benchmark datasets and the corresponding performance measurements. In addition, we summarize the existing RGB-D tracking methods. Finally, we discuss possible future directions in the field of RGB-D tracking.
Taxonomy
Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology
