3D-Convolution Guided Spectral-Spatial Transformer for Hyperspectral Image Classification

Shyam Varahagiri; Aryaman Sinha; Shiv Ram Dubey; Satish Kumar Singh

arXiv:2404.13252 · cs.CV · April 23, 2024 · 1 citation

TL;DR

This paper introduces a 3D-Convolution guided Spectral-Spatial Transformer (3D-ConvSST) for hyperspectral image classification. It combines the local spectral-spatial feature extraction of CNNs with the self-attention of Transformers and reports improved accuracy on three datasets.

Contribution

The paper places a 3D-Convolution Guided Residual Module (CGRM) between the Transformer encoders to fuse local spatial and spectral information, and it replaces the class (CLS) token with global average pooling.
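The difference between reading out a class token and global average pooling can be illustrated with a minimal NumPy sketch. It is not taken from the paper's repository: the function names, the tensor layout, and the single linear head are assumptions made for illustration only.

```python
import numpy as np


def classify_with_cls(encoded, weight, bias):
    """Read out the class token (assumed to sit at position 0).

    encoded: (batch, 1 + num_tokens, dim)
    """
    cls_state = encoded[:, 0, :]
    return cls_state @ weight + bias


def classify_with_gap(encoded, weight, bias):
    """Average all output tokens instead of relying on a class token.

    encoded: (batch, num_tokens, dim)
    """
    pooled = encoded.mean(axis=1)
    return pooled @ weight + bias


# Hypothetical sizes, chosen only to make the example runnable.
rng = np.random.default_rng(0)
batch, num_tokens, dim, num_classes = 2, 5, 8, 3
encoded = rng.normal(size=(batch, num_tokens, dim))
weight = rng.normal(size=(dim, num_classes))
bias = np.zeros(num_classes)

logits = classify_with_gap(encoded, weight, bias)
print(logits.shape)  # (2, 3)
```

With pooling, every token contributes directly to the logits, so the classifier does not depend on how well a single extra token has attended to the rest of the sequence.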

Findings

01. Outperforms state-of-the-art models on three datasets

02. Demonstrates superior spectral-spatial feature fusion

03. Validates effectiveness of 3D-Convolution guidance

Abstract

In recent years, Vision Transformers (ViTs) have shown promising classification performance over Convolutional Neural Networks (CNNs) due to their self-attention mechanism. Many researchers have incorporated ViTs for Hyperspectral Image (HSI) classification. HSIs are characterized by narrow contiguous spectral bands, providing rich spectral data. Although ViTs excel with sequential data, they cannot extract spectral-spatial information like CNNs. Furthermore, to have high classification performance, there should be a strong interaction between the HSI token and the class (CLS) token. To solve these issues, we propose a 3D-Convolution guided Spectral-Spatial Transformer (3D-ConvSST) for HSI classification that utilizes a 3D-Convolution Guided Residual Module (CGRM) between encoders to "fuse" the local spatial and spectral information and to enhance the feature propagation.…
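The abstract does not spell out the internals of the CGRM, so the following NumPy sketch shows only the general idea of a 3D convolution with a residual connection applied to a token sequence between encoder layers. The token layout (one token per spectral band, each holding a flattened spatial patch), the single 3x3x3 kernel, and the names `conv3d_same` and `conv_guided_residual` are assumptions, not the authors' design.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv3d_same(volume, kernel):
    """Zero-padded 3D cross-correlation that keeps the input shape.

    volume: (bands, height, width); kernel: odd-sized (kb, kh, kw).
    """
    pads = [(k // 2, k // 2) for k in kernel.shape]
    padded = np.pad(volume, pads)
    # windows[b, h, w] is the kernel-sized neighbourhood centred on (b, h, w).
    windows = sliding_window_view(padded, kernel.shape)
    return np.einsum("bhwijk,ijk->bhw", windows, kernel)


def conv_guided_residual(tokens, kernel, patch_hw):
    """Hypothetical stand-in for a 3D-convolution guided residual step.

    tokens: (bands, height * width), one token per spectral band (assumed).
    """
    bands = tokens.shape[0]
    volume = tokens.reshape(bands, *patch_hw)
    mixed = conv3d_same(volume, kernel)
    # Residual connection: the encoder output passes through unchanged and
    # the convolved spectral-spatial features are added on top.
    return tokens + mixed.reshape(tokens.shape)


rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 5 * 5))       # 6 bands, 5x5 spatial patch
kernel = rng.normal(size=(3, 3, 3)) * 0.1  # random weights; learned in practice
out = conv_guided_residual(tokens, kernel, patch_hw=(5, 5))
print(out.shape)  # (6, 25)
```

Because the kernel spans neighbouring bands as well as neighbouring pixels, each output value mixes local spectral and spatial context, while the skip path keeps the original encoder features available to the next layer.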

Code & Models

Repositories

shyamvarahagiri/3d-convsst
PyTorch · Official

Taxonomy

Topics: Remote-Sensing Image Classification

Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Dropout · Dense Connections · Label Smoothing · Residual Connection · Softmax · Adam