FOCUS: Fused Observation of Channels for Unveiling Spectra

Xi Xiao; Aristeidis Tsaris; Anika Tabassum; John Lagergren; Larry M. York; Tianyang Wang; Xiao Wang

arXiv:2507.14787 · cs.CV · April 28, 2026

TL;DR

FOCUS is a framework for interpreting frozen Vision Transformers (ViTs) on hyperspectral imagery. It generates stable, spectral-aware saliency maps efficiently and without modifying the model.

Contribution

FOCUS introduces class-specific spectral prompts and a learnable [SINK] token to improve spectral interpretability and attention stability in frozen ViTs applied to hyperspectral data.
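
To make the role of an extra sink token concrete, here is a minimal sketch in plain Python of how one additional logit can absorb attention mass that would otherwise land on the real tokens. It is not the paper's implementation: the function names, the way the sink logit is appended, and the negative-log form of the attraction loss are all assumptions made for illustration, since the summary does not give the actual formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_with_sink(token_scores, sink_score):
    """Append a hypothetical sink logit to the attention logits of one query.

    Returns (weights over the real tokens, weight assigned to the sink).
    """
    weights = softmax(list(token_scores) + [sink_score])
    return weights[:-1], weights[-1]


def attraction_loss(sink_weights):
    """Assumed loss form: mean negative log of the attention mass on the sink.

    It shrinks as queries flagged as noisy place more attention on the sink.
    """
    return -sum(math.log(w) for w in sink_weights) / len(sink_weights)


# One query attending to two real tokens plus the sink.
token_w, sink_w = attention_with_sink([0.0, 0.0], sink_score=0.0)
_, sink_w_high = attention_with_sink([0.0, 0.0], sink_score=3.0)
```

With equal logits, the sink receives the same share as each real token. Raising the sink logit moves mass away from the real tokens and lowers the assumed loss, which is the intuition behind training a token to soak up noisy or redundant attention.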

Findings

01. Increases band-level IoU by 15%

02. Reduces attention collapse by over 40%

03. Produces saliency maps aligning with expert annotations
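
As a rough illustration of what a band-level IoU could measure, the sketch below compares the set of most salient spectral bands against a set of annotated bands. The summary does not define the metric, so the top-k selection rule, the function names, and the toy numbers are assumptions, not the paper's evaluation protocol.

```python
def top_k_bands(saliency, k):
    """Return the indices of the k bands with the highest saliency score."""
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    return set(ranked[:k])


def band_iou(predicted, annotated):
    """Intersection over union between two sets of band indices."""
    union = predicted | annotated
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(predicted & annotated) / len(union)


# Toy per-band saliency scores for a 5-band cube (hypothetical values).
saliency = [0.1, 0.9, 0.8, 0.2, 0.7]
predicted = top_k_bands(saliency, k=3)   # bands 1, 2, 4
annotated = {1, 2, 3}                    # hypothetical expert-marked bands
score = band_iou(predicted, annotated)   # 2 shared bands out of 4 in the union
```

Under this definition, a higher score means the bands highlighted by the saliency map overlap more with the bands an expert marked as relevant.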

Abstract

Hyperspectral imaging (HSI) captures hundreds of narrow, contiguous wavelength bands, making it a powerful tool in biology, agriculture, and environmental monitoring. However, interpreting Vision Transformers (ViTs) in this setting remains largely unexplored due to two key challenges: (1) existing saliency methods struggle to capture meaningful spectral cues, often collapsing attention onto the class token, and (2) full-spectrum ViTs are computationally prohibitive for interpretability, given the high-dimensional nature of HSI data. We present FOCUS, the first framework that enables reliable and efficient spatial-spectral interpretability for frozen ViTs. FOCUS introduces two core components: class-specific spectral prompts that guide attention toward semantically meaningful wavelength groups, and a learnable [SINK] token trained with an attraction loss to absorb noisy or redundant…
