HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model
Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang

TL;DR
HyperSIGMA is a scalable, transformer-based foundation model for hyperspectral image interpretation that unifies multiple tasks and scenes, leveraging a novel sparse sampling attention mechanism and a large-scale dataset for superior performance.
Contribution
The paper introduces HyperSIGMA, a vision transformer model with a novel SSA mechanism and a large hyperspectral dataset, enabling cross-task and cross-scene HSI interpretation.
Findings
HyperSIGMA outperforms state-of-the-art methods across various HSI tasks.
The SSA mechanism effectively captures diverse contextual features.
HyperSIGMA demonstrates high scalability, robustness, and real-world applicability.
Abstract
Accurate hyperspectral image (HSI) interpretation is critical for providing valuable insights into various earth observation-related applications such as urban planning, precision agriculture, and environmental monitoring. However, existing HSI processing methods are predominantly task-specific and scene-dependent, which severely limits their ability to transfer knowledge across tasks and scenes, thereby reducing the practicality in real-world applications. To address these challenges, we present HyperSIGMA, a vision transformer-based foundation model that unifies HSI interpretation across tasks and scenes, scalable to over one billion parameters. To overcome the spectral and spatial redundancy inherent in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic block of…
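The abstract names sparse sampling attention (SSA) but is cut off before describing how it works. As a rough illustration of the general idea only, the sketch below lets each query token attend to a small sampled subset of tokens instead of all of them. Everything in it is an assumption made for illustration: the function name `sparse_sampling_attention`, the `num_samples` and `seed` parameters, the uniform random sampling rule, and the use of raw token features as queries, keys, and values. It is not the authors' SSA design.

```python
import math
import random


def sparse_sampling_attention(tokens, num_samples, seed=0):
    """Each query attends to `num_samples` sampled tokens rather than all tokens.

    tokens: list of equal-length feature vectors (lists of floats).
    Returns (outputs, weights); weights[i] is the softmax over query i's samples.
    """
    rng = random.Random(seed)
    n, dim = len(tokens), len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Assumption: uniform random sampling stands in for whatever
        # sampling rule the paper's SSA actually uses.
        sampled = rng.sample(range(n), num_samples)
        # Scaled dot-product scores against the sampled keys only.
        scores = [
            sum(a * b for a, b in zip(query, tokens[j])) / math.sqrt(dim)
            for j in sampled
        ]
        # Numerically stable softmax over the sampled positions.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        # Weighted sum of the sampled value vectors (values = tokens here).
        out = [
            sum(w[s] * tokens[j][c] for s, j in enumerate(sampled))
            for c in range(dim)
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Usage: 16 tokens with 8 features each; every query looks at only 4 tokens.
data_rng = random.Random(1)
tokens = [[data_rng.gauss(0, 1) for _ in range(8)] for _ in range(16)]
outputs, weights = sparse_sampling_attention(tokens, num_samples=4)
```

In this toy version the cost per query scales with `num_samples` rather than with the total number of tokens, which is one plausible way a sparse scheme could reduce the spatial and spectral redundancy the abstract mentions.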
Taxonomy
Topics: Remote-Sensing Image Classification
Methods: Softmax · Attention Is All You Need
