GeneMamba: An Efficient and Effective Foundation Model on Single Cell Data
Cong Qi, Hanzhang Fang, Siqi Jiang, Xun Song, Tianxing Hu, and Wei Zhi

TL;DR
GeneMamba is a scalable, efficient foundation model for single-cell transcriptomics that uses state space modeling to handle high-dimensional, sparse data with improved computational efficiency and biological interpretability.
Contribution
It introduces a novel state space model architecture, Bi-Mamba, for single-cell data, enabling linear-time processing and integrating biologically informed objectives.
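The linear-time, bidirectional idea can be illustrated with a toy sketch. This is not GeneMamba's implementation: it uses a plain scalar linear recurrence h_t = a * h_{t-1} + b * x_t with fixed, hypothetical coefficients a and b, rather than Mamba's input-dependent (selective) parameters. Summing the forward and backward passes is also an assumption made for illustration. The function names are invented here.

```python
from typing import List


def linear_scan(x: List[float], a: float = 0.9, b: float = 1.0) -> List[float]:
    """Apply h_t = a * h_{t-1} + b * x_t in a single pass over the sequence.

    Each step does constant work, so the cost grows linearly with len(x).
    The coefficients a and b are fixed, hypothetical values for illustration.
    """
    h = 0.0
    states = []
    for value in x:
        h = a * h + b * value
        states.append(h)
    return states


def bidirectional_scan(x: List[float], a: float = 0.9, b: float = 1.0) -> List[float]:
    """Combine a left-to-right scan with a right-to-left scan.

    The backward pass scans the reversed sequence and is flipped back so that
    position t sees context from both sides. Summation is an assumed way of
    merging the two directions.
    """
    forward = linear_scan(x, a, b)
    backward = linear_scan(x[::-1], a, b)[::-1]
    return [f + r for f, r in zip(forward, backward)]


if __name__ == "__main__":
    # A toy "expression" sequence; real inputs would be gene token embeddings.
    print(bidirectional_scan([1.0, 0.0, 0.0], a=0.5))
```

Both passes touch each position once, so the total work is proportional to the sequence length, in contrast to the pairwise comparisons of self-attention, which grow quadratically.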
Findings
Outperforms transformer baselines in multiple tasks
Demonstrates strong interpretability and robustness
Pretrained on nearly 30 million cells
Abstract
Single-cell RNA sequencing (scRNA-seq) enables high-resolution analysis of cellular heterogeneity, but its complexity, marked by high dimensionality, sparsity, and batch effects, poses major computational challenges. Transformer-based models have made significant advances in this domain but are often limited by their quadratic complexity and suboptimal handling of long-range dependencies. In this work, we introduce GeneMamba, a scalable and efficient foundation model for single-cell transcriptomics built on state space modeling. Leveraging the Bi-Mamba architecture, GeneMamba captures bidirectional gene context with linear-time complexity, offering substantial computational gains over transformer baselines. The model is pretrained on nearly 30 million cells and incorporates biologically informed objectives, including pathway-aware contrastive loss and rank-based gene…
Taxonomy
Topics: Single-cell and spatial transcriptomics
