Bipartite Graph Attention-based Clustering for Large-scale scRNA-seq Data

Zhuomin Liang; Liang Bai; Xian Yang

arXiv:2602.07475·cs.LG·February 10, 2026

TL;DR

This paper introduces BGFormer, a bipartite graph transformer that clusters large-scale scRNA-seq data efficiently. It uses learnable anchor tokens and bipartite attention between cells and anchors, which avoids the quadratic cost in the number of cells incurred by earlier transformer-based methods.

Contribution

The paper presents a bipartite graph attention mechanism with learnable anchors, so that the cost of clustering scRNA-seq data scales linearly with the number of cells.

Findings

01. BGFormer achieves linear complexity with respect to the number of cells.

02. Experimental results show improved scalability and clustering accuracy.

03. The model effectively groups cells in large-scale datasets.

Abstract

scRNA-seq clustering is a critical task for analyzing single-cell RNA sequencing (scRNA-seq) data, as it groups cells with similar gene expression profiles. Transformers, as powerful foundational models, have been applied to scRNA-seq clustering. Their self-attention mechanism automatically assigns higher attention weights to cells within the same cluster, enhancing the distinction between clusters. Existing methods for scRNA-seq clustering, such as graph transformer-based models, treat each cell as a token in a sequence. Their computational and space complexities are $O(n^{2})$ with respect to the number of cells, limiting their applicability to large-scale scRNA-seq datasets. To address this challenge, we propose a Bipartite Graph Transformer-based clustering model (BGFormer) for scRNA-seq data. We introduce a set of learnable anchor tokens as shared reference points to…
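To make the complexity argument concrete, the following is a minimal NumPy sketch of attention between cells and a small set of anchor tokens. It is not the authors' implementation: the function name `bipartite_attention`, the two-step update (anchors gather from cells, then cells read from anchors), the absence of learned projections, and the random (untrained) anchors are all assumptions made for illustration. The point it shows is that the score matrix has shape (n, m) for n cells and m anchors, so time and memory grow linearly in n when m is fixed, instead of the (n, n) matrix required by full self-attention.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bipartite_attention(cells, anchors):
    """Hypothetical cell-anchor attention step (illustrative only).

    cells:   (n, d) cell embeddings
    anchors: (m, d) anchor tokens; learnable in the paper, fixed here
    Returns updated cells (n, d), updated anchors (m, d),
    and the cell-to-anchor attention weights (n, m).
    """
    d = cells.shape[1]
    # Only cell-anchor pairs are scored: an (n, m) matrix, never (n, n).
    scores = cells @ anchors.T / np.sqrt(d)

    # Each anchor aggregates information from all cells.
    anchor_weights = softmax(scores.T, axis=1)   # (m, n), rows sum to 1
    new_anchors = anchor_weights @ cells         # (m, d)

    # Each cell reads back from the updated anchors.
    cell_weights = softmax(scores, axis=1)       # (n, m), rows sum to 1
    new_cells = cell_weights @ new_anchors       # (n, d)
    return new_cells, new_anchors, cell_weights


# Toy data: 1000 cells, 8 anchors, 16-dimensional embeddings (all assumed sizes).
rng = np.random.default_rng(0)
n, m, d = 1000, 8, 16
cells = rng.normal(size=(n, d))
anchors = rng.normal(size=(m, d))

new_cells, new_anchors, cell_weights = bipartite_attention(cells, anchors)
print(new_cells.shape, new_anchors.shape, cell_weights.shape)
```

In this sketch, doubling the number of cells doubles the size of `scores`, whereas a full self-attention matrix would quadruple. In a trained model the anchors and any query, key, and value projections would be parameters updated by gradient descent; they are omitted here to keep the example short.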

Taxonomy

Topics: Single-cell and spatial transcriptomics · Domain Adaptation and Few-Shot Learning · Bioinformatics and Genomic Networks