LESSViT: Robust Hyperspectral Representation Learning under Spectral Configuration Shift

Haozhe Si; Yuxuan Wan; Yuqing Wang; Minh Do; Han Zhao

arXiv:2605.18541·cs.CV·May 19, 2026

TL;DR

This paper introduces LESSViT, a low-rank, sensor-flexible Vision Transformer architecture for hyperspectral imagery that improves cross-sensor generalization by efficiently modeling spatial-spectral interactions.

Contribution

The paper proposes LESSViT with LESS Attention and HyperMAE for robust, efficient hyperspectral representation learning across different sensors and spectral configurations.

Findings

01 LESSViT enhances robustness under spectral shifts on the SpectralEarth benchmark.

02 The low-rank factorization reduces computational complexity significantly.

03 Explicit spatial-spectral modeling is crucial for scalable hyperspectral learning.

Abstract

Modeling hyperspectral imagery (HSI) across different sensors presents a fundamental challenge due to variations in wavelength coverage, band sampling, and channel dimensionality. As a result, models trained under a fixed spectral configuration often fail to generalize to other sensors. Existing Vision Transformer (ViT) approaches either rely on implicit spectral modeling with fixed channel assumptions or adopt explicit spatial-spectral attention with prohibitive computational cost, leading to a fundamental trade-off between efficiency and expressiveness. In this work, we introduce Low-rank Efficient Spatial-Spectral ViT (LESSViT), a sensor-flexible architecture for cross-spectral generalization. LESSViT is built on LESS Attention, a structured low-rank factorization that models joint spatial-spectral interactions through separable spatial and spectral components, reducing the…
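The idea of modeling joint spatial-spectral interactions through separable components can be illustrated with a small NumPy sketch. This is not the authors' LESS Attention, whose exact form is not given in the text above. It assumes the simplest separable structure, a single Kronecker product of one spatial attention map and one spectral attention map, and all names here (separable_attention, joint_attention_reference, the toy sizes N, C, D, d) are hypothetical. The sketch checks that applying the two small maps in sequence gives the same result as applying the full joint map over all N*C tokens.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def separable_attention(q_s, k_s, q_c, k_c, v):
    """Apply a spatial map and a spectral map in sequence (illustrative only).

    q_s, k_s: (N, d) spatial queries/keys, one per spatial patch.
    q_c, k_c: (C, d) spectral queries/keys, one per band group.
    v:        (N, C, D) values for every (patch, band) token.
    """
    a_s = softmax(q_s @ k_s.T / np.sqrt(q_s.shape[-1]))  # (N, N)
    a_c = softmax(q_c @ k_c.T / np.sqrt(q_c.shape[-1]))  # (C, C)
    out = np.einsum("nm,mcd->ncd", a_s, v)    # mix across spatial positions
    out = np.einsum("ce,ned->ncd", a_c, out)  # mix across spectral positions
    return out, a_s, a_c


def joint_attention_reference(a_s, a_c, v):
    """Materialize the full (N*C, N*C) Kronecker map and apply it directly."""
    n, c, dim = v.shape
    a_joint = np.kron(a_s, a_c)
    return (a_joint @ v.reshape(n * c, dim)).reshape(n, c, dim)


# Toy sizes, chosen arbitrarily for the illustration.
rng = np.random.default_rng(0)
N, C, D, d = 6, 5, 4, 3
q_s, k_s = rng.normal(size=(N, d)), rng.normal(size=(N, d))
q_c, k_c = rng.normal(size=(C, d)), rng.normal(size=(C, d))
v = rng.normal(size=(N, C, D))

out, a_s, a_c = separable_attention(q_s, k_s, q_c, k_c, v)
ref = joint_attention_reference(a_s, a_c, v)
print("matches joint map:", np.allclose(out, ref))
print("joint entries:", (N * C) ** 2, "factor entries:", N * N + C * C)
```

Under this assumption the joint map has (N*C)^2 entries while the two factors together have N^2 + C^2, which is where a separable structure saves memory and compute. Because neither factor depends on a fixed number of bands, the same weights can in principle be applied when C changes across sensors.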
