FANet: Quality-Aware Feature Aggregation Network for Robust RGB-T Tracking
Yabin Zhu, Chenglong Li, Bin Luo, Jin Tang

TL;DR
FANet is a deep network for robust RGB-T tracking that aggregates hierarchical features within each modality and combines RGB and thermal information to handle challenging conditions such as occlusion and low illumination.
Contribution
The paper introduces FANet, a novel network that adaptively aggregates hierarchical features from RGB and thermal data for improved robustness in challenging tracking scenarios.
Findings
FANet outperforms state-of-the-art RGBT trackers on benchmark datasets.
The adaptive aggregation improves robustness against noise and low-quality sources.
Hierarchical feature integration enhances tracking accuracy under various adverse conditions.
Abstract
This paper investigates how to perform robust visual tracking in adverse and challenging conditions using complementary visual and thermal infrared data (RGBT tracking). We propose a novel deep network architecture called quality-aware Feature Aggregation Network (FANet) for robust RGBT tracking. Unlike existing RGBT trackers, our FANet aggregates hierarchical deep features within each modality to handle the challenge of significant appearance changes caused by deformation, low illumination, background clutter and occlusion. In particular, we employ max pooling to transform these hierarchical, multi-resolution features into a uniform space with the same resolution, and use a 1x1 convolution to compress the feature dimensions for more effective hierarchical feature aggregation. To model the interactions between RGB and thermal modalities, we elaborately…
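The pooling-then-compression step described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. The function names (`max_pool`, `conv1x1`, `aggregate`), the nested-list `[C][H][W]` layout, the channel concatenation before the 1x1 convolution, and the fixed weights are all assumptions made for illustration. In a real network the 1x1 weights would be learned, and the maps here are assumed to be square with sizes that are integer multiples of the target resolution.

```python
def max_pool(fmap, k):
    """Non-overlapping k x k max pooling on a [C][H][W] nested list."""
    pooled = []
    for channel in fmap:
        h, w = len(channel), len(channel[0])
        pooled.append([
            [max(channel[i + di][j + dj] for di in range(k) for dj in range(k))
             for j in range(0, w - k + 1, k)]
            for i in range(0, h - k + 1, k)
        ])
    return pooled


def conv1x1(fmap, weights):
    """1x1 convolution: weights is [C_out][C_in]; mixes channels per pixel."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(wt * fmap[c][i][j] for c, wt in enumerate(row)) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]


def aggregate(features, target_size, weights):
    """Pool every level to target_size x target_size, stack channels, compress.

    Assumption: levels are concatenated along the channel axis before the
    1x1 convolution; the abstract does not spell out this detail.
    """
    stacked = []
    for fmap in features:
        k = len(fmap[0]) // target_size  # pooling factor for this level
        stacked.extend(max_pool(fmap, k) if k > 1 else fmap)
    return conv1x1(stacked, weights)


# Toy input: a shallow 4x4 map and a deep 2x2 map, one channel each.
shallow = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
deep = [[[1, 1], [1, 1]]]

# Hypothetical fixed weights: average the two stacked channels into one.
fused = aggregate([shallow, deep], target_size=2, weights=[[0.5, 0.5]])
print(fused)  # [[[3.5, 4.5], [7.5, 8.5]]]
```

The shallow map is pooled from 4x4 down to 2x2 so that it matches the deep map, and the 1x1 convolution then reduces the two stacked channels to one. Stacking levels at a common resolution lets a single per-pixel channel mixing decide how much each level contributes.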
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Image Enhancement Techniques
Methods: Max Pooling · 1x1 Convolution · Convolution
