SCALE: Scalable Conditional Atlas-Level Endpoint transport for virtual cell perturbation prediction
Shuizhou Chen, Lang Yu, Kedu Jin, Songming Zhang, Hao Wu, Wenxuan Huang, Sheng Xu, Quan Qian, Qin Chen, Lei Bai, Siqi Sun, Zhangyang Gao

TL;DR
This paper introduces SCALE, a scalable foundation model for virtual cell perturbation prediction that improves training efficiency, stability, and biological fidelity in large-scale single-cell data modeling.
Contribution
SCALE combines a BioNeMo-based framework with a set-aware flow architecture to enhance scalability, stability, and biological relevance in virtual cell perturbation prediction.
Findings
12.51x faster pretraining and 1.29x faster inference
12.02% improvement in PDCorr metric
10.66% improvement in DE Overlap metric
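This summary does not define the two metrics named above, so the following is only a rough sketch under stated assumptions: PDCorr is taken to mean the Pearson correlation between predicted and observed perturbation deltas (mean perturbed expression minus mean control expression), and DE Overlap is taken to mean the fraction of top-k genes, ranked by absolute mean shift, shared between prediction and observation. The function names and the toy data are hypothetical and are not taken from the paper, which may rank genes with a statistical test instead.

```python
from statistics import fmean

def mean_profile(cells):
    """Per-gene mean over a list of cells (each cell is a list of gene values)."""
    return [fmean(col) for col in zip(*cells)]

def delta(cells, ctrl_cells):
    """Mean expression shift of `cells` relative to the control cells."""
    return [p - c for p, c in zip(mean_profile(cells), mean_profile(ctrl_cells))]

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def delta_correlation(pred, true, ctrl):
    """Assumed PDCorr: correlation of predicted vs. observed deltas."""
    return pearson(delta(pred, ctrl), delta(true, ctrl))

def top_k_overlap(pred, true, ctrl, k):
    """Assumed DE Overlap: shared fraction of the k genes with the largest |delta|."""
    def top(cells):
        d = delta(cells, ctrl)
        return set(sorted(range(len(d)), key=lambda g: -abs(d[g]))[:k])
    return len(top(pred) & top(true)) / k

# Toy data: 2 cells x 4 genes per condition (hypothetical values).
ctrl = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
true = [[3.0, 1.0, 0.0, 1.0], [3.0, 1.0, 0.0, 1.0]]
pred = [[1.0, 3.0, 0.0, 1.0], [1.0, 3.0, 0.0, 1.0]]

print(delta_correlation(pred, true, ctrl))
print(top_k_overlap(pred, true, ctrl, k=2))
```

A reconstruction-style score such as mean squared error can look good even when the predicted shift points at the wrong genes; delta-based scores like these are one way to penalize that, which is consistent with the abstract's emphasis on biological fidelity.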
Abstract
Virtual cell models aim to enable in silico experimentation by predicting how cells respond to genetic, chemical, or cytokine perturbations from single-cell measurements. In practice, however, large-scale perturbation prediction remains constrained by three coupled bottlenecks: inefficient training and inference pipelines, unstable modeling in high-dimensional sparse expression space, and evaluation protocols that overemphasize reconstruction-like accuracy while underestimating biological fidelity. In this work, we present SCALE, a specialized large-scale foundation model for virtual cell perturbation prediction that addresses these limitations jointly. First, we build a BioNeMo-based training and inference framework that substantially improves data throughput, distributed scalability, and deployment efficiency, yielding a 12.51× speedup on pretraining and 1.29× on inference over the prior…
Taxonomy
Topics: Single-cell and spatial transcriptomics · Cell Image Analysis Techniques · Gene Regulatory Network Analysis
