Transformers Fusion across Disjoint Samples for Hyperspectral Image Classification
Muhammad Ahmad, Manuel Mazzara, Salvatore Distefano

TL;DR
This paper proposes a novel attentional fusion of 3D Swin Transformer and Spatial-spectral Transformer to improve hyperspectral image classification, emphasizing disjoint sample training for robustness and superior performance.
Contribution
The paper introduces an attentional fusion of two transformer architectures, the 3D Swin Transformer and the Spatial-spectral Transformer, for hyperspectral image classification, and it trains and evaluates on disjoint samples.
Findings
The fusion approach outperforms traditional methods.
Training on disjoint samples enhances robustness.
The fused model achieves higher classification accuracy.
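To make "disjoint" concrete, here is a minimal sketch of a class-stratified split of labeled pixel indices into non-overlapping training, validation, and test sets. The function name `disjoint_split` and the 50/25/25 fractions are assumptions for illustration; the summary does not give the authors' actual split ratios or procedure. The sketch only guarantees that no pixel index is shared between sets; it does not address overlap between the spatial patches extracted around those pixels.

```python
import random
from collections import defaultdict

def disjoint_split(labels, train_frac=0.5, val_frac=0.25, seed=0):
    """Split sample indices into disjoint train/val/test lists, per class.

    labels: one class label per labeled pixel.
    train_frac, val_frac: hypothetical fractions, not taken from the paper.
    """
    rng = random.Random(seed)

    # Group sample indices by class so every class appears in each split.
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)

    train, val, test = [], [], []
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        n_train = int(len(idxs) * train_frac)
        n_val = int(len(idxs) * val_frac)
        # Consecutive slices of one shuffled list cannot share an index.
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]
    return train, val, test

labels = [0] * 8 + [1] * 4
train, val, test = disjoint_split(labels)
```

Because each index is assigned exactly once, the three lists partition the labeled samples, which is the property that disjoint evaluation relies on.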
Abstract
The 3D Swin Transformer (3D-ST), known for its hierarchical attention and window-based processing, excels at capturing intricate spatial relationships within images. The Spatial-spectral Transformer (SST), meanwhile, specializes in modeling long-range dependencies through self-attention mechanisms. This paper therefore introduces a novel method: an attentional fusion of these two transformers to significantly enhance the classification performance of Hyperspectral Images (HSIs). What sets this approach apart is its emphasis on integrating the attentional mechanisms of both architectures. This integration not only refines the modeling of spatial and spectral information but also contributes to more precise and accurate classification results. The experimentation and evaluation on benchmark HSI datasets underscore the importance of employing disjoint training, validation, and test…
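As a rough illustration of what an attentional fusion of two branches can look like, the sketch below weights two feature vectors with softmax attention scores and sums them. This is a generic attention-weighted fusion, not the authors' architecture: the function `attentional_fusion`, the scoring vector `score_vec` (standing in for learned parameters), and the toy feature values are all assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attentional_fusion(feat_a, feat_b, score_vec):
    """Fuse two equal-length feature vectors with attention weights.

    feat_a, feat_b: features from two branches (for example, one per
    transformer); score_vec is a hypothetical learned scoring vector.
    """
    # Score each branch by its dot product with the scoring vector.
    score_a = sum(x * w for x, w in zip(feat_a, score_vec))
    score_b = sum(x * w for x, w in zip(feat_b, score_vec))
    # Normalize the two scores so the branch weights sum to 1.
    w_a, w_b = softmax([score_a, score_b])
    # The fused feature is the weighted sum of the two branches.
    fused = [w_a * a + w_b * b for a, b in zip(feat_a, feat_b)]
    return fused, (w_a, w_b)

# Toy features standing in for the two branch outputs.
fused, (w_a, w_b) = attentional_fusion([1.0, 0.0], [0.0, 1.0], [2.0, 0.0])
```

In this toy call the first branch scores higher, so it receives the larger weight; with a zero scoring vector the fusion reduces to a plain average of the two branches.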
Taxonomy
Topics: Remote-Sensing Image Classification
Methods: Attention Is All You Need · Stochastic Depth · Dropout · Softmax · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Linear Layer · Swin Transformer · Dense Connections
