Learning Dual-Fused Modality-Aware Representations for RGBD Tracking
Shang Gao, Jinyu Yang, Zhe Li, Feng Zheng, Aleš Leonardis, Jingkuan Song

TL;DR
This paper introduces DMTracker, a novel RGBD tracking method that fuses shared and modality-specific features using cross-modal attention to improve robustness in complex scenes.
Contribution
The paper proposes a dual-fused, modality-aware framework that effectively combines shared and specific features for enhanced RGBD object tracking.
Findings
Achieves superior performance on challenging RGBD benchmarks.
Effectively fuses shared and modality-specific information.
Demonstrates robustness in complex tracking scenarios.
Abstract
With the development of depth sensors in recent years, RGBD object tracking has received significant attention. Compared with traditional RGB object tracking, the addition of the depth modality can effectively reduce interference between the target and the background. However, some existing RGBD trackers use the two modalities separately and thus ignore particularly useful information shared between them. On the other hand, some methods attempt to fuse the two modalities by treating them equally, which loses modality-specific features. To tackle these limitations, we propose a novel Dual-fused Modality-aware Tracker (termed DMTracker), which aims to learn informative and discriminative representations of the target objects for robust RGBD tracking. The first fusion module focuses on extracting the shared information between modalities based on cross-modal attention. The…
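To make the phrase "cross-modal attention" concrete, here is a minimal sketch of generic scaled dot-product attention in which queries come from RGB tokens and keys and values come from depth tokens. It is an illustration only and not the DMTracker implementation: the function name `cross_modal_attention`, the toy token values, and the absence of learned projection matrices are all assumptions made for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, keys, values):
    """Generic scaled dot-product attention across two modalities.

    queries: tokens from one modality (here RGB), each of dimension d
    keys, values: tokens from the other modality (here depth)
    Returns one attended vector per query token.
    Hypothetical helper: no learned projections are applied.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    attended = []
    for q in queries:
        # Similarity of this RGB token to every depth token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted sum of depth values, one output channel at a time.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        attended.append(out)
    return attended


# Toy tokens (made-up values): 3 RGB tokens and 2 depth tokens, dimension 2.
rgb_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
depth_tokens = [[1.0, 0.0], [0.0, 1.0]]

shared = cross_modal_attention(rgb_tokens, depth_tokens, depth_tokens)
```

Each row of `shared` is a convex combination of depth tokens, weighted by how strongly the corresponding RGB token matches them. A real tracker would typically apply learned query, key, and value projections and operate on feature maps from a backbone network.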
Taxonomy
Topics: Video Surveillance and Tracking Methods · Infrared Thermography in Medicine
