VP-Hype: A Hybrid Mamba-Transformer Framework with Visual-Textual Prompting for Hyperspectral Image Classification

Abdellah Zakaria Sellam; Fadi Abdeladhim Zidi; Salah Eddine Bekhouche; Ihssen Houhou; Marouane Tliba; Cosimo Distante; Abdenour Hadid

arXiv:2603.01174 · cs.CV · March 3, 2026

TL;DR

VP-Hype is a hybrid Mamba-Transformer framework with visual-textual prompting for hyperspectral image classification in low-data scenarios. It combines efficient sequence modeling with multi-modal guidance and reports over 99.6% accuracy with only 2% training data.

Contribution

The paper presents a novel hybrid architecture unifying State-Space Models and Transformers, along with dual-modal prompts, to enhance hyperspectral image classification under label scarcity.

Findings

01. Achieves over 99.6% accuracy with only 2% training data.

02. Outperforms existing methods in low-data regimes.

03. Reduces computational complexity compared to standard Transformers.
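
The complexity finding can be illustrated with a toy comparison. The sketch below is not the VP-Hype implementation: it uses a scalar, non-selective linear state-space recurrence (much simpler than Mamba's selective scan) and counts the pairwise scores that full self-attention would compute for one head. The function names and the parameters `a`, `b`, `c` are hypothetical and chosen only for illustration.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Toy scalar state-space recurrence (illustrative, not Mamba itself):
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    One pass over the input, so the work grows linearly with len(xs)."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def ssm_step_count(seq_len):
    """Number of recurrence updates for a sequence of length seq_len."""
    return seq_len


def attention_score_count(seq_len):
    """Number of query-key scores full self-attention computes per head."""
    return seq_len * seq_len


if __name__ == "__main__":
    # Compare how the two counts grow as the token sequence gets longer.
    for seq_len in (50, 100, 200):
        print(seq_len, ssm_step_count(seq_len), attention_score_count(seq_len))
```

Doubling the sequence length doubles the recurrence steps but quadruples the attention scores, which is the scaling gap a hybrid backbone of this kind aims to narrow.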

Abstract

Accurate classification of hyperspectral imagery (HSI) is often frustrated by the tension between high-dimensional spectral data and the extreme scarcity of labeled training samples. While hierarchical models like LoLA-SpecViT have demonstrated the power of local windowed attention and parameter-efficient fine-tuning, the quadratic complexity of standard Transformers remains a barrier to scaling. We introduce VP-Hype, a framework that rethinks HSI classification by unifying the linear-time efficiency of State-Space Models (SSMs) with the relational modeling of Transformers in a novel hybrid architecture. Building on a robust 3D-CNN spectral front-end, VP-Hype replaces conventional attention blocks with a Hybrid Mamba-Transformer backbone to capture long-range dependencies with significantly reduced computational overhead. Furthermore, we address the label-scarcity problem by integrating…

Taxonomy

Topics: Remote-Sensing Image Classification · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning