FANet: Quality-Aware Feature Aggregation Network for Robust RGB-T Tracking

Yabin Zhu; Chenglong Li; Bin Luo; Jin Tang

arXiv:1811.09855·cs.CV·October 15, 2019·6 cites

TL;DR

FANet is a deep learning architecture designed for robust RGB-T tracking, effectively aggregating features from both modalities to handle challenging conditions like occlusion and low illumination.

Contribution

The paper introduces FANet, a novel network that adaptively aggregates hierarchical features from RGB and thermal data for improved robustness in challenging tracking scenarios.

Findings

01. FANet outperforms state-of-the-art RGBT trackers on benchmark datasets.

02. The adaptive aggregation improves robustness against noise and low-quality sources.

03. Hierarchical feature integration enhances tracking accuracy under various adverse conditions.

Abstract

This paper investigates how to perform robust visual tracking in adverse and challenging conditions using complementary visual and thermal infrared data (RGBT tracking). We propose a novel deep network architecture called quality-aware Feature Aggregation Network (FANet) for robust RGBT tracking. Unlike existing RGBT trackers, our FANet aggregates hierarchical deep features within each modality to handle the challenge of significant appearance changes caused by deformation, low illumination, background clutter and occlusion. In particular, we employ max pooling to transform these hierarchical, multi-resolution features into a uniform space with the same resolution, and use a 1x1 convolution to compress the feature dimensions and achieve more effective hierarchical feature aggregation. To model the interactions between RGB and thermal modalities, we elaborately…
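
The within-modality aggregation step described in the abstract (max pooling to a common resolution, then a 1x1 convolution to compress channels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`max_pool`, `conv1x1`, `aggregate`), the non-overlapping pooling window, the toy feature sizes, and the hand-picked weights are all assumptions made for illustration. In a trained network the 1x1 convolution weights would be learned, and the abstract does not state which layers or pooling sizes the paper uses.

```python
from typing import List

# A feature map is indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]


def max_pool(fmap: FeatureMap, k: int) -> FeatureMap:
    """Non-overlapping k x k max pooling applied to each channel."""
    pooled = []
    for channel in fmap:
        h, w = len(channel), len(channel[0])
        pooled.append([
            [max(channel[r + dr][c + dc] for dr in range(k) for dc in range(k))
             for c in range(0, w - k + 1, k)]
            for r in range(0, h - k + 1, k)
        ])
    return pooled


def conv1x1(fmap: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """1x1 convolution: each output channel is a per-pixel weighted
    sum of the input channels (weights[out_channel][in_channel])."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(wt * fmap[i][r][c] for i, wt in enumerate(row_w))
          for c in range(w)]
         for r in range(h)]
        for row_w in weights
    ]


def aggregate(layers: List[FeatureMap], target: int,
              weights: List[List[float]]) -> FeatureMap:
    """Pool every layer down to target x target, concatenate along the
    channel axis, then compress the channels with a 1x1 convolution."""
    stacked: FeatureMap = []
    for fmap in layers:
        # Assumes square maps whose size is a multiple of `target`.
        k = len(fmap[0]) // target
        stacked.extend(max_pool(fmap, k) if k > 1 else fmap)
    return conv1x1(stacked, weights)


# Toy input: a shallow 1-channel 4x4 map and a deep 2-channel 2x2 map.
shallow = [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]]
deep = [[[1, 1], [1, 1]], [[2, 2], [2, 2]]]

# Hypothetical weights: 3 stacked input channels -> 2 output channels.
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
out = aggregate([shallow, deep], target=2, weights=weights)
```

Here the shallow map is pooled from 4x4 to 2x2 so that it matches the deep map, the three resulting channels are stacked, and the 1x1 convolution reduces them to two channels. The sketch is written in dependency-free Python for readability; a practical version would use a tensor library.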

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Image Enhancement Techniques

Methods: Max Pooling · 1x1 Convolution · Convolution