# Hyperspectral Target Tracking via Spatial–Spectral Attention Weight Variance Gradient and Depth Contrast Enhancement

**Authors:** Yao Yu, Mingkai Ge, Jie Yu, Isaac Kwesi Nooni, Pattathal Vijayakumar Arun, Dong Zhao

PMC · DOI: 10.3390/s26041327 · 2026-02-19

## TL;DR

This paper introduces a hyperspectral target tracking method that combines spatial–spectral attention with depth contrast enhancement so that the tracker stays robust when the target changes scale.

## Contribution

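The approach combines a spatial–spectral attention weight variance gradient, used to reduce the dimensionality of the raw hyperspectral input, with depth contrast enhancement inside a Vision Transformer encoder. A weight adaptive mixed fusion step then merges the two sources of information for tracking.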

## Key findings

- The proposed method achieves an AUC of 0.6704 and a DP@20 of 0.9455 on hyperspectral video sequences.
- It outperforms existing methods by 3.1% in robustness to scale variations.
- Depth-aware geometric constraints improve appearance modeling adaptability.
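The two reported metrics can be illustrated with a short sketch. It assumes the common OTB-style convention: DP@20 is the fraction of frames whose predicted box center lies within 20 pixels of the ground-truth center, and AUC is the mean success rate over IoU thresholds from 0 to 1. The paper's exact evaluation protocol is not given in this summary, so the function names, the `(x, y, w, h)` box layout, and the 21-threshold grid are assumptions.

```python
from math import hypot


def center_error(a, b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
    return hypot(ax - bx, ay - by)


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def dp_at(preds, gts, threshold=20.0):
    """Distance precision: share of frames with center error <= threshold."""
    errors = [center_error(p, g) for p, g in zip(preds, gts)]
    return sum(e <= threshold for e in errors) / len(errors)


def success_auc(preds, gts, steps=21):
    """Mean success rate over evenly spaced IoU thresholds in [0, 1]."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


# Two frames: one perfect prediction, one shifted 30 px to the right.
gts = [(0, 0, 10, 10), (0, 0, 10, 10)]
preds = [(0, 0, 10, 10), (30, 0, 10, 10)]
print(dp_at(preds, gts))  # 0.5
```

Under this convention both metrics lie in [0, 1], which is consistent with the reported values of 0.6704 and 0.9455.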

## Abstract

Scale variations pose a significant challenge in hyperspectral target tracking. To address this challenge, we propose a method that combines spatial–spectral attention mechanisms with depth estimation to strengthen the tracker. First, the method processes raw hyperspectral video inputs through a spatial–spectral attention weight variance gradient, using the variance gradient for dimensionality reduction and producing fused spatial–spectral attention weights for subsequent tracking. Second, the method integrates a dual-path preprocessing module that handles the template and search regions, coupled with a Vision Transformer encoder that incorporates depth contrast enhancement. Finally, a weight adaptive mixed fusion step optimizes how the fused spatial–spectral attention weights are combined with the enhanced depth contrast. The key advantage of the proposed method lies in its depth-aware geometric constraints and its use of spectral–spatial information, which together enable robust appearance modeling that intrinsically adapts to target scale variations. Extensive experiments on hyperspectral video sequences show an AUC of 0.6704 and a DP@20 of 0.9455, outperforming existing state-of-the-art methods by 3.1% in robustness to scale variations.
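The abstract does not define the variance gradient or the fusion rule, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes one plausible reading: each band is scored by how sharply its spatial variance changes relative to neighbouring bands, the highest-scoring bands are kept, and two feature vectors are then blended with softmax-normalized weights. All function names, the `k` parameter, and the logit inputs are hypothetical.

```python
from math import exp
from statistics import pvariance


def band_variances(cube):
    """Spatial variance of each band; cube is a list of flat pixel lists."""
    return [float(pvariance(band)) for band in cube]


def variance_gradient(variances):
    """Finite-difference gradient of variance along the band axis
    (central differences inside, one-sided at the two ends)."""
    n = len(variances)
    if n < 2:
        return [0.0] * n
    grads = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        grads.append((variances[hi] - variances[lo]) / (hi - lo))
    return grads


def select_bands(cube, k=3):
    """ASSUMPTION: keep the k bands with the largest |variance gradient|."""
    grads = variance_gradient(band_variances(cube))
    ranked = sorted(range(len(cube)), key=lambda i: abs(grads[i]), reverse=True)
    return sorted(ranked[:k])


def adaptive_fuse(spectral, depth, logit_s, logit_d):
    """ASSUMPTION: convex blend of two feature vectors, with weights
    given by a softmax over two (hypothetically learned) logits."""
    m = max(logit_s, logit_d)
    es, ed = exp(logit_s - m), exp(logit_d - m)
    ws = es / (es + ed)
    return [ws * s + (1.0 - ws) * d for s, d in zip(spectral, depth)]


# Toy cube: 4 bands of 4 pixels each, with variance growing along the bands.
cube = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 2, 0, 2], [0, 4, 0, 4]]
print(select_bands(cube, k=2))  # [2, 3]
```

In a real tracker the logits would come from a trained module and the features would be tensors rather than Python lists. The sketch only shows the shape of the computation: reduce the spectral dimension first, then blend the spectral and depth cues with weights that sum to one.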

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12944065/full.md

---
Source: https://tomesphere.com/paper/PMC12944065