SUIT: Spatial-Spectral Union-Intersection Interaction Network for Hyperspectral Object Tracking
Fengchao Xiong, Zhenxing Wu, Sen Jia, Yuntao Qian

TL;DR
This paper introduces SUIT, a hyperspectral object tracking network that effectively models spatial and spectral interactions using Transformers and a novel spectral loss, achieving state-of-the-art results in challenging scenarios.
Contribution
The paper proposes a new architecture combining Transformers and set theory-based spectral modeling, along with a spectral loss, to improve hyperspectral object tracking.
Findings
Achieves state-of-the-art tracking performance on hyperspectral datasets.
Models spectral interactions as the union of band-wise spatial interactions, integrating both shared and band-specific spatial cues.
Enhances robustness to shape deformation and appearance changes.
Abstract
Hyperspectral videos (HSVs), with their inherent spatial-spectral-temporal structure, offer distinct advantages in challenging tracking scenarios such as cluttered backgrounds and small objects. However, existing methods primarily focus on spatial interactions between the template and search regions, often overlooking spectral interactions, leading to suboptimal performance. To address this issue, this paper investigates spectral interactions from both the architectural and training perspectives. At the architectural level, we first establish band-wise long-range spatial relationships between the template and search regions using Transformers. We then model spectral interactions using the inclusion-exclusion principle from set theory, treating them as the union of spatial interactions across all bands. This enables the effective integration of both shared and band-specific spatial cues.…
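The inclusion-exclusion idea mentioned in the abstract can be illustrated with a small sketch. This is not the paper's formulation: it assumes each band yields a template-search interaction score in [0, 1] and that the "intersection" of several bands is the product of their scores. The function names are hypothetical. Under those assumptions, the alternating inclusion-exclusion sum collapses to a simple closed form, which the second function computes directly.

```python
from itertools import combinations
from math import prod


def union_inclusion_exclusion(band_scores):
    """Union of per-band scores via the inclusion-exclusion principle.

    Assumption (not from the paper): each score lies in [0, 1] and the
    intersection of a group of bands is the product of their scores.
    """
    n = len(band_scores)
    total = 0.0
    for k in range(1, n + 1):
        # Odd-sized groups are added, even-sized groups are subtracted.
        sign = 1.0 if k % 2 == 1 else -1.0
        for group in combinations(band_scores, k):
            total += sign * prod(group)
    return total


def union_closed_form(band_scores):
    """Equivalent closed form under the same product assumption."""
    return 1.0 - prod(1.0 - s for s in band_scores)


if __name__ == "__main__":
    scores = [0.2, 0.5, 0.1]  # hypothetical per-band interaction scores
    print(union_inclusion_exclusion(scores))
    print(union_closed_form(scores))
```

The explicit sum enumerates every subset of bands and so grows exponentially with the band count; the closed form is linear, which is why a practical model would more plausibly use something of that shape. A response that is strong in any single band keeps the union high, while responses shared across bands are not double-counted.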
Taxonomy
Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Face recognition and analysis
