TL;DR
MVNet is a novel hybrid deep learning architecture that combines 3D-CNN, Transformer, and Mamba modules to improve hyperspectral image classification accuracy and efficiency, addressing high-dimensionality and spectral redundancy challenges.
Contribution
The paper introduces MVNet, a hybrid Mamba-Transformer architecture with a redesigned dual-branch Mamba module and optimized HSI-MambaVision Mixer, advancing spectral-spatial feature extraction in hyperspectral imaging.
Findings
Outperforms existing methods in accuracy on the Indian Pines (IN), University of Pavia (UP), and Kennedy Space Center (KSC) datasets.
Achieves higher computational efficiency and robustness.
Effectively models both short-range and long-range dependencies.
Abstract
Hyperspectral image (HSI) classification faces challenges such as high-dimensional data, limited training samples, and spectral redundancy, which often lead to overfitting and poor generalization. This paper proposes MVNet, a novel network architecture that integrates the local feature extraction of a 3D-CNN, the global modeling of a Transformer, and the linear-complexity sequence modeling of Mamba to achieve efficient spatial-spectral feature extraction and fusion. MVNet features a redesigned dual-branch Mamba module consisting of a State Space Model (SSM) branch and a non-SSM branch that employs 1D convolution with SiLU activation. This design enhances the modeling of both short-range and long-range dependencies while reducing the computational latency of the traditional Mamba design. The optimized HSI-MambaVision Mixer module overcomes the unidirectional limitation of causal convolution, capturing…
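The dual-branch idea described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`conv1d_same`, `ssm_scan`, `dual_branch`), the fixed scalar parameters `a`, `b`, `c`, the smoothing kernel, and the pairing of the two branch outputs are all assumptions made for illustration. In particular, the real Mamba SSM uses input-dependent (selective) parameters, whereas the toy scan below uses constants.

```python
import math


def silu(v):
    # SiLU (Sigmoid Linear Unit): v * sigmoid(v).
    return v / (1.0 + math.exp(-v))


def conv1d_same(x, kernel):
    # Non-causal 1D convolution with zero padding on both sides, so each
    # output position sees neighbours to its left AND right.
    # Assumes an odd kernel length so the output keeps the input length.
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(x))
    ]


def ssm_scan(x, a=0.9, b=0.1, c=1.0):
    # Toy scalar state space recurrence (hypothetical constants):
    #   h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t
    # A single pass over the sequence, so cost grows linearly with length.
    h = 0.0
    out = []
    for v in x:
        h = a * h + b * v
        out.append(c * h)
    return out


def dual_branch(x, kernel=(0.25, 0.5, 0.25)):
    # SSM branch: convolution -> SiLU -> recurrent scan (long-range context).
    ssm_out = ssm_scan([silu(v) for v in conv1d_same(x, kernel)])
    # Non-SSM branch: convolution -> SiLU only (short-range context).
    local_out = [silu(v) for v in conv1d_same(x, kernel)]
    # Assumed fusion: pair the two branch outputs at every position,
    # a stand-in for concatenation along the feature dimension.
    return list(zip(ssm_out, local_out))


if __name__ == "__main__":
    spectrum = [0.0, 0.0, 1.0, 0.0, 0.0]  # impulse in the middle band
    for position, pair in enumerate(dual_branch(spectrum)):
        print(position, pair)
```

With the impulse input, the non-SSM branch responds at positions on both sides of the impulse, which is the behaviour a causal convolution cannot provide, while the SSM branch carries a decaying trace of the impulse forward through the remaining positions.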
Taxonomy
Methods: Sigmoid Linear Unit · Mamba: Linear-Time Sequence Modeling with Selective State Spaces · Convolution
