Hybrid Transformer-Mamba Architecture for Weakly Supervised Volumetric Medical Segmentation

Yiheng Lyu; Lian Xu; Mohammed Bennamoun; Farid Boussaid; Coen Arrow; Girish Dwivedi

arXiv:2512.10353·cs.CV·December 12, 2025

TL;DR

This paper introduces TranSamba, a hybrid Transformer-Mamba architecture that captures 3D context efficiently for weakly supervised volumetric medical segmentation and is reported to outperform existing methods across multiple datasets.

Contribution

The paper presents a hybrid architecture that augments a standard Vision Transformer backbone with Cross-Plane Mamba blocks, which exchange information across neighboring slices of a volume.

Findings

01. Outperforms existing methods across multiple datasets.

02. Achieves linear time complexity with respect to volume depth.

03. Maintains constant memory usage during batch processing.
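
To make the linear-in-depth finding concrete, the following is a rough operation-count sketch, not a measurement from the paper. It assumes a volume of `depth` slices with `tokens_per_slice` patch tokens each, and compares the pairwise token interactions of joint 3D self-attention against per-slice self-attention followed by one linear pass across slices. Both function names and the cost model are illustrative assumptions.

```python
def joint_3d_attention_cost(depth, tokens_per_slice):
    """Pairwise interactions if every token attends to every token in the volume."""
    total_tokens = depth * tokens_per_slice
    return total_tokens * total_tokens


def per_slice_attention_plus_scan_cost(depth, tokens_per_slice):
    """Pairwise interactions within each slice, plus one linear pass across slices.

    Assumed cost model: attention costs tokens_per_slice**2 per slice, and the
    cross-slice pass touches each token once.
    """
    attention = depth * tokens_per_slice ** 2
    scan = depth * tokens_per_slice
    return attention + scan


if __name__ == "__main__":
    tokens = 196  # hypothetical: a 14 x 14 patch grid per slice
    for depth in (16, 32, 64):
        joint = joint_3d_attention_cost(depth, tokens)
        hybrid = per_slice_attention_plus_scan_cost(depth, tokens)
        print(f"depth={depth:3d}  joint={joint:>12d}  per-slice+scan={hybrid:>10d}")
```

Under this cost model, doubling the depth doubles the per-slice-plus-scan count but quadruples the joint 3D attention count.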

Abstract

Weakly supervised semantic segmentation offers a label-efficient solution to train segmentation models for volumetric medical imaging. However, existing approaches often rely on 2D encoders that neglect the inherent volumetric nature of the data. We propose TranSamba, a hybrid Transformer-Mamba architecture designed to capture 3D context for weakly supervised volumetric medical segmentation. TranSamba augments a standard Vision Transformer backbone with Cross-Plane Mamba blocks, which leverage the linear complexity of state space models for efficient information exchange across neighboring slices. The information exchange enhances the pairwise self-attention within slices computed by the Transformer blocks, directly contributing to the attention maps for object localization. TranSamba achieves effective volumetric modeling with time complexity that scales linearly with the input volume…
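
The cross-slice information exchange described in the abstract can be illustrated with a toy linear recurrence along the depth axis. This is a minimal sketch under strong assumptions, not the authors' Cross-Plane Mamba block: it uses a fixed scalar `decay` in place of Mamba's learned, input-dependent state space parameters, it treats each slice as a single feature vector, and the bidirectional combination is an assumption made for illustration. The names `cross_slice_scan` and `bidirectional_mix` are hypothetical.

```python
def cross_slice_scan(slices, decay=0.9):
    """Run a linear recurrence over slices: state = decay*state + (1-decay)*features.

    `slices` is a list of per-slice feature vectors (lists of floats).
    The loop visits each slice once, so the cost is linear in the number of slices.
    """
    state = [0.0] * len(slices[0])
    outputs = []
    for features in slices:
        state = [decay * s + (1.0 - decay) * f for s, f in zip(state, features)]
        outputs.append(state)
    return outputs


def bidirectional_mix(slices, decay=0.9):
    """Add forward and backward scan context to each slice (residual connection).

    Assumption: scanning in both directions lets every slice receive context
    from slices on either side of it.
    """
    forward = cross_slice_scan(slices, decay)
    backward = cross_slice_scan(slices[::-1], decay)[::-1]
    return [
        [x + 0.5 * (f + b) for x, f, b in zip(xs, fs, bs)]
        for xs, fs, bs in zip(slices, forward, backward)
    ]


if __name__ == "__main__":
    # Five slices with two channels; only the middle slice carries signal.
    volume = [[0.0, 0.0] for _ in range(5)]
    volume[2] = [1.0, 1.0]
    for index, features in enumerate(bidirectional_mix(volume)):
        print(index, [round(value, 4) for value in features])
```

In this toy setting, a signal present only in the middle slice reaches every other slice after mixing, with a strength that decays with slice distance. In an architecture like the one described, features mixed this way would then feed the within-slice self-attention computed by the Transformer blocks.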

Taxonomy

Topics: Advanced Neural Network Applications · 3D Shape Modeling and Analysis · AI in cancer detection