MoEMambaMIL: Structure-Aware Selective State Space Modeling for Whole-Slide Image Analysis
Dongqing Xie, Yonghuang Wu

TL;DR
MoEMambaMIL is a structure-aware state space model that captures hierarchical spatial dependencies in whole-slide images and achieves state-of-the-art results on 9 downstream tasks.
Contribution
The paper presents a framework that combines region-nested selective scanning with mixture-of-experts modeling in a state space model for hierarchical WSI analysis.
Findings
Achieves state-of-the-art results on 9 downstream tasks.
Effectively models long sequences with hierarchical spatial structure.
Promotes expert specialization for diverse diagnostic patterns.
Abstract
Whole-slide image (WSI) analysis is challenging due to the gigapixel scale of slides and their inherent hierarchical multi-resolution structure. Existing multiple instance learning (MIL) approaches often model WSIs as unordered collections of patches, which limits their ability to capture structured dependencies between global tissue organization and local cellular patterns. Although recent State Space Models (SSMs) enable efficient modeling of long sequences, how to structure WSI tokens to fully exploit their spatial hierarchy remains an open problem. We propose MoEMambaMIL, a structure-aware SSM framework for WSI analysis that integrates region-nested selective scanning with mixture-of-experts (MoE) modeling. Leveraging multi-resolution preprocessing, MoEMambaMIL organizes patch tokens into region-aware sequences that preserve spatial containment across resolutions. On top of this…
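To make the idea of a region-aware sequence concrete, here is a minimal Python sketch of one possible nested ordering. It is an illustration only, not the authors' scanning scheme, which the abstract does not specify. The function name `region_nested_order`, the two-level setup, the raster traversal, and the `scale` factor are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical token identifier: (resolution level, row, column).
# Level 0 is the coarse (low-resolution) grid, level 1 the fine grid.
Token = Tuple[int, int, int]


def region_nested_order(coarse_rows: int, coarse_cols: int, scale: int = 2) -> List[Token]:
    """Order tokens so each coarse region is immediately followed by the
    fine patches it spatially contains (assumed raster order at both levels)."""
    sequence: List[Token] = []
    for r in range(coarse_rows):
        for c in range(coarse_cols):
            # Emit the coarse region token first.
            sequence.append((0, r, c))
            # Then emit its scale x scale child patches at the finer level.
            for dr in range(scale):
                for dc in range(scale):
                    sequence.append((1, r * scale + dr, c * scale + dc))
    return sequence


# A 1 x 2 coarse grid, each region covering 2 x 2 fine patches.
order = region_nested_order(1, 2)
```

In this ordering, every fine patch sits next to its parent region in the sequence, so a sequence model sees the containment relationship as adjacency rather than having to recover it from an unordered bag.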
Taxonomy
Topics: Cell Image Analysis Techniques · AI in cancer detection · Medical Image Segmentation Techniques
