BTMTrack: Robust RGB-T Tracking via Dual-template Bridging and Temporal-Modal Candidate Elimination
Zhongxuan Zhang, Bi Zeng, Xinyu Ni, Yimin Du

TL;DR
BTMTrack is an RGB-T tracking framework that combines a dual-template backbone with a Temporal-Modal Candidate Elimination (TMCE) strategy, integrating temporal information and cross-modal interactions to improve accuracy in challenging conditions.
Contribution
The paper presents a novel RGB-T tracking framework built from a dual-template backbone, the TMCE strategy, and the TDTB module, which together improve temporal integration and cross-modal fusion for better tracking performance.
Findings
Achieves 72.3% precision on the LasHeR dataset.
Outperforms existing methods on three benchmark datasets.
Demonstrates robustness in low-light and adverse weather conditions.
Abstract
RGB-T tracking leverages the complementary strengths of RGB and thermal infrared (TIR) modalities to address challenging scenarios such as low illumination and adverse weather. However, existing methods often fail to effectively integrate temporal information and perform efficient cross-modal interactions, which constrains their adaptability to dynamic targets. In this paper, we propose BTMTrack, a novel framework for RGB-T tracking. The core of our approach lies in the dual-template backbone network and the Temporal-Modal Candidate Elimination (TMCE) strategy. The dual-template backbone effectively integrates temporal information, while the TMCE strategy focuses the model on target-relevant tokens by evaluating temporal and modal correlations, reducing computational overhead and avoiding irrelevant background noise. Building upon this foundation, we propose the Temporal Dual Template…
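The token-pruning idea behind TMCE can be illustrated with a minimal sketch. The abstract does not say how temporal and modal correlations are scored or combined, so everything below is an assumption made for illustration: the score is the softmax attention that each search-region token receives from template tokens, averaged over two templates (standing in for the temporal axis) and two modalities (RGB and TIR), and the top-scoring fraction of positions is kept. The names token_scores, eliminate_candidates, and keep_ratio are hypothetical and do not come from the paper.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def token_scores(search_tokens, template_tokens):
    """Attention mass each search token receives, averaged over template queries.

    Assumption: scaled dot-product attention is used as the correlation measure.
    """
    dim = len(search_tokens[0])
    scores = [0.0] * len(search_tokens)
    for query in template_tokens:
        attn = softmax([dot(query, key) / math.sqrt(dim) for key in search_tokens])
        for i, a in enumerate(attn):
            scores[i] += a / len(template_tokens)
    return scores


def eliminate_candidates(search_rgb, search_tir, templates_rgb, templates_tir,
                         keep_ratio=0.5):
    """Return the sorted indices of search positions to keep.

    templates_rgb / templates_tir: lists of templates (for example an initial
    and a later template), each a list of token vectors. Averaging the scores
    across templates and modalities is an illustrative choice, not the paper's.
    """
    n = len(search_rgb)
    pairs = [(search_rgb, t) for t in templates_rgb]
    pairs += [(search_tir, t) for t in templates_tir]
    combined = [0.0] * n
    for search, template in pairs:
        for i, s in enumerate(token_scores(search, template)):
            combined[i] += s / len(pairs)
    n_keep = max(1, math.ceil(keep_ratio * n))
    ranked = sorted(range(n), key=lambda i: combined[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy data: 4 search positions with 2-dimensional tokens per modality.
# Positions 1 and 2 resemble the templates; positions 0 and 3 are background.
search_rgb = [[0.0, 1.0], [3.0, 0.0], [2.0, 0.0], [0.0, 0.5]]
search_tir = [[0.0, 1.0], [2.0, 0.0], [3.0, 0.0], [0.0, 1.0]]
templates_rgb = [[[1.0, 0.0]], [[2.0, 0.0]]]  # two templates, one token each
templates_tir = [[[1.0, 0.0]], [[1.5, 0.0]]]

kept = eliminate_candidates(search_rgb, search_tir, templates_rgb, templates_tir)
print(kept)  # [1, 2]
```

In this toy setup the two background positions are dropped, so later layers would process half as many search tokens. A real tracker would apply such pruning to learned features inside the backbone, and the keep ratio and score fusion would follow the paper rather than the averages assumed here.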
Taxonomy
Topics: Video Surveillance and Tracking Methods · Face and Expression Recognition
Methods: Sparse Evolutionary Training
