X-VMamba: Explainable Vision Mamba

Mohamed A. Mabrok; Yalda Zafari

arXiv:2511.12694 · cs.CV · November 18, 2025

Open Access

TL;DR

This paper introduces a controllability-based interpretability framework for Vision State Space Models (SSMs). The framework quantifies how individual parts of the input (tokens or patches) influence the model's internal state dynamics, runs with linear complexity, and is validated on medical imaging data.

Contribution

The paper proposes two complementary interpretability formulations for SSMs: a Jacobian-based method applicable to any SSM architecture, and a faster Gramian-based approach for diagonal SSMs. Neither requires modifying the underlying architecture.

Findings

01 Revealed hierarchical feature refinement in medical imaging SSMs

02 Identified domain-specific controllability signatures

03 Showed influence of scanning strategies on attention patterns

Abstract

State Space Models (SSMs), particularly the Mamba architecture, have recently emerged as powerful alternatives to Transformers for sequence modeling, offering linear computational complexity while achieving competitive performance. Yet, despite their effectiveness, understanding how these Vision SSMs process spatial information remains challenging due to the lack of transparent, attention-like mechanisms. To address this gap, we introduce a controllability-based interpretability framework that quantifies how different parts of the input sequence (tokens or patches) influence the internal state dynamics of SSMs. We propose two complementary formulations: a Jacobian-based method applicable to any SSM architecture that measures influence through the full chain of state propagation, and a Gramian-based approach for diagonal SSMs that achieves superior speed through closed-form analytical…
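The abstract is cut off before the formulations are spelled out, so the following is only a loose illustration of the two ideas it names, not the authors' implementation. It assumes a toy time-invariant diagonal linear SSM, h_t = a * h_{t-1} + b * x_t, with a scalar input per token; Mamba's selective, input-dependent parameters are not modeled. The function names `jacobian_influence` and `gramian_diagonal` and the values of `a`, `b`, and `T` are hypothetical. Under these assumptions, the Jacobian of the final state with respect to token k is a^(T-k) * b, and the finite-horizon controllability Gramian has a closed form via a geometric series.

```python
import numpy as np

def jacobian_influence(a, b, T):
    """Per-token influence ||d h_T / d x_k||_2 for the toy SSM
    h_t = a * h_{t-1} + b * x_t (elementwise, assumed time-invariant)."""
    # Token k (1-indexed) reaches the final state through T - k state updates.
    powers = T - np.arange(1, T + 1)                     # shape (T,)
    jac = (a[None, :] ** powers[:, None]) * b[None, :]   # shape (T, N)
    return np.linalg.norm(jac, axis=1)

def gramian_diagonal(a, b, T):
    """Finite-horizon controllability Gramian
    sum_{k=0}^{T-1} A^k b b^T (A^k)^T with A = diag(a), in closed form."""
    prod = np.outer(a, a)                      # entries a_i * a_j
    geom = (1.0 - prod ** T) / (1.0 - prod)    # geometric series; needs a_i * a_j != 1
    return np.outer(b, b) * geom

# Hypothetical toy parameters (stable: |a_i| < 1).
a = np.array([0.9, 0.5, 0.1])
b = np.array([1.0, 2.0, 0.5])
T = 8

scores = jacobian_influence(a, b, T)   # one influence score per token
W = gramian_diagonal(a, b, T)          # N x N Gramian, no loop over tokens
```

In this toy setting the two views agree in aggregate: the sum of squared per-token Jacobian norms equals the trace of the Gramian. The Jacobian route keeps a score per token, while the closed-form Gramian summarizes the whole sequence without iterating over it.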

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Machine Learning in Healthcare