SSMamba: A Self-Supervised Hybrid State Space Model for Pathological Image Classification

Enhui Chai; Sicheng Chen; Tianyi Zhang; Xingyu Li; Tianxiang Cui

arXiv:2604.15711·cs.CV·May 8, 2026

TL;DR

SSMamba is a self-supervised hybrid model for pathological image classification that targets three limitations of existing ROI-level foundation models: cross-magnification domain shift, inadequate local-global relationship modeling, and insufficient fine-grained sensitivity.

Contribution

SSMamba introduces a framework with domain-adaptive modules that enhance feature learning without relying on large external datasets, and it outperforms existing methods.

Findings

1. Outperforms 11 SOTA models on 10 ROI datasets
2. Surpasses 8 SOTA methods on 6 WSI datasets
3. Effectively mitigates domain shift and enhances fine-grained sensitivity

Abstract

Pathological diagnosis is highly reliant on image analysis, where Regions of Interest (ROIs) serve as the primary basis for diagnostic evidence, while whole-slide image (WSI)-level tasks primarily capture aggregated patterns. To extract these critical morphological features, ROI-level Foundation Models (FMs) based on Vision Transformers (ViTs) and large-scale self-supervised learning (SSL) have been widely adopted. However, three core limitations remain in their application to ROI analysis: (1) cross-magnification domain shift, as fixed-scale pretraining hinders adaptation to diverse clinical settings; (2) inadequate local-global relationship modeling, wherein the ViT backbone of FMs suffers from high computational overhead and imprecise local characterization; (3) insufficient fine-grained sensitivity, as traditional self-attention mechanisms tend to overlook subtle diagnostic cues. To…
