MVNet: Hyperspectral Remote Sensing Image Classification Based on Hybrid Mamba-Transformer Vision Backbone Architecture

Guandong Li; Mengxia Ye

arXiv:2507.04409 · cs.CV · July 8, 2025

TL;DR

MVNet is a novel hybrid deep learning architecture that combines 3D-CNN, Transformer, and Mamba modules to improve hyperspectral image classification accuracy and efficiency, addressing high-dimensionality and spectral redundancy challenges.

Contribution

The paper introduces MVNet, a hybrid Mamba-Transformer architecture with a redesigned dual-branch Mamba module and optimized HSI-MambaVision Mixer, advancing spectral-spatial feature extraction in hyperspectral imaging.

Findings

01. Outperforms existing methods in accuracy on the Indian Pines (IN), Pavia University (UP), and Kennedy Space Center (KSC) datasets.

02. Achieves higher computational efficiency and robustness.

03. Effectively models both short-range and long-range dependencies.

Abstract

Hyperspectral image (HSI) classification faces challenges such as high-dimensional data, limited training samples, and spectral redundancy, which often lead to overfitting and insufficient generalization capability. This paper proposes a novel MVNet network architecture that integrates 3D-CNN's local feature extraction, Transformer's global modeling, and Mamba's linear complexity sequence modeling capabilities, achieving efficient spatial-spectral feature extraction and fusion. MVNet features a redesigned dual-branch Mamba module, including a State Space Model (SSM) branch and a non-SSM branch employing 1D convolution with SiLU activation, enhancing modeling of both short-range and long-range dependencies while reducing computational latency in traditional Mamba. The optimized HSI-MambaVision Mixer module overcomes the unidirectional limitation of causal convolution, capturing…
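To make the dual-branch idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the scalar state space recurrence, the smoothing kernel, and the choice to run convolution and SiLU before the scan in the SSM branch are all assumptions made for illustration. It shows a toy SSM branch and a non-SSM branch (1D convolution followed by SiLU) applied to one sequence, and contrasts a causal convolution with a non-causal one to show what "unidirectional" means.

```python
import math

def silu(x):
    """SiLU (Sigmoid Linear Unit): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def conv1d_same(seq, kernel):
    """Non-causal 1D convolution (cross-correlation, as in deep learning
    frameworks) with zero padding, so each output sees both neighbours."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(seq))]

def conv1d_causal(seq, kernel):
    """Causal 1D convolution: output i depends only on inputs at positions <= i."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(seq)
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(seq))]

def ssm_scan(seq, a=0.9, b=1.0, c=1.0):
    """Toy scalar state space recurrence (hypothetical parameters a, b, c):
    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t.
    A single pass over the sequence, so the cost is linear in its length."""
    h = 0.0
    out = []
    for x in seq:
        h = a * h + b * x
        out.append(c * h)
    return out

def dual_branch_mixer(seq, kernel=(0.25, 0.5, 0.25)):
    """Illustrative two-branch mixer (assumed layout, not the paper's code)."""
    # SSM branch (assumption): convolution -> SiLU -> linear-time scan.
    ssm_branch = ssm_scan([silu(v) for v in conv1d_same(seq, kernel)])
    # Non-SSM branch: 1D convolution with SiLU activation, no recurrence.
    conv_branch = [silu(v) for v in conv1d_same(seq, kernel)]
    # Pair the two outputs per position, standing in for channel concatenation.
    return list(zip(ssm_branch, conv_branch))

if __name__ == "__main__":
    impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
    kernel = (0.25, 0.5, 0.25)
    print("causal:    ", conv1d_causal(impulse, kernel))
    print("non-causal:", conv1d_same(impulse, kernel))
    print("mixer:     ", dual_branch_mixer(impulse))
```

With an impulse in the middle of the sequence, the causal convolution produces nothing before the impulse position, while the non-causal one responds on both sides. In a real model the two branches would operate on multi-channel feature tensors and the scan would use learned, input-dependent parameters.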

Code & Models

Repositories

leeguandong/MVNet-for-HSI
PyTorch · Official

Taxonomy

Methods: Sigmoid Linear Unit · Mamba: Linear-Time Sequence Modeling with Selective State Spaces · Convolution