Exploring Fusion Strategies for Accurate RGBT Visual Object Tracking
Zhangyong Tang (1), Tianyang Xu (1), Hui Li (1), Xiao-Jun Wu (1), Xuefeng Zhu (1), Josef Kittler (2) ((1) Jiangnan University, Wuxi, China, (2) University of Surrey, UK)

TL;DR
This paper investigates multi-modal RGBT object tracking by exploring pixel-, feature-, and decision-level fusion strategies, introducing a novel decision-level fusion method that improves tracking accuracy and robustness.
Contribution
It proposes a new decision-level fusion strategy with dynamic weighting and linear template update, outperforming existing methods and winning the VOT-RGBT2020 challenge.
Findings
Decision-level fusion outperforms pixel- and feature-level methods.
The proposed method achieves state-of-the-art results on multiple datasets.
The decision-level approach enhances robustness and accuracy in RGBT tracking.
Abstract
We address the problem of multi-modal object tracking in video and explore various options for fusing the complementary information conveyed by the visible (RGB) and thermal infrared (TIR) modalities, including pixel-level, feature-level and decision-level fusion. Specifically, unlike existing methods, fusion at the pixel level follows the paradigm of the image fusion task. Feature-level fusion is implemented by an attention mechanism with optionally excited channels. At the decision level, a novel fusion strategy is put forward, since even a simple averaging configuration has already shown its superiority. The effectiveness of the proposed decision-level fusion strategy is due to a number of innovative contributions, including a dynamic weighting of the RGB and TIR contributions and a linear template update operation. A variant of it produced the winning tracker at the Visual Object…
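To make the two decision-level ingredients named in the abstract more concrete, the following is a minimal sketch of dynamic weighting of RGB and TIR response maps and of a linear template update. It is not the authors' implementation: this summary does not give the weighting rule, so the peak-score confidence used here is an assumption, as are all function names, the non-negative score range, and the `rate` parameter.

```python
def peak_confidence(response):
    """Assumed confidence measure: the maximum score in a 2-D response map."""
    return max(max(row) for row in response)


def fuse_responses(resp_rgb, resp_tir):
    """Fuse two equally sized response maps with dynamic weights.

    Assumption: scores are non-negative and each modality is weighted in
    proportion to its peak response. Plain averaging is the fallback when
    both peaks are zero.
    """
    c_rgb = peak_confidence(resp_rgb)
    c_tir = peak_confidence(resp_tir)
    total = c_rgb + c_tir
    w_rgb = c_rgb / total if total > 0 else 0.5
    w_tir = 1.0 - w_rgb
    fused = [
        [w_rgb * a + w_tir * b for a, b in zip(row_rgb, row_tir)]
        for row_rgb, row_tir in zip(resp_rgb, resp_tir)
    ]
    return fused, (w_rgb, w_tir)


def update_template(template, new_feature, rate=0.1):
    """Linear template update: (1 - rate) * old + rate * new, element-wise.

    `rate` is a hypothetical learning rate chosen for illustration.
    """
    return [(1.0 - rate) * t + rate * f for t, f in zip(template, new_feature)]


# Toy example: the RGB map has the stronger peak, so it receives more weight.
resp_rgb = [[0.1, 0.2], [0.3, 0.9]]
resp_tir = [[0.2, 0.3], [0.1, 0.1]]
fused, (w_rgb, w_tir) = fuse_responses(resp_rgb, resp_tir)
new_template = update_template([1.0, 0.0], [0.0, 1.0], rate=0.25)
```

Fixing both weights at 0.5 reduces `fuse_responses` to the simple averaging configuration that the abstract cites as the starting point for the decision-level strategy.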
Taxonomy
Topics: Advanced Image Fusion Techniques · Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection
