Scaling Laws for Masked-Reconstruction Transformers on Single-Cell Transcriptomics
Ihor Kendiukhov

TL;DR
This study investigates whether neural scaling laws, known in NLP and vision, also apply to single-cell transcriptomics using masked-reconstruction transformers, revealing data-dependent scaling behaviors.
Contribution
First systematic analysis of scaling laws in single-cell genomics with transformers, identifying data availability as key to scaling behavior and establishing parallels with NLP models.
Findings
In the data-rich regime, validation loss follows a power law in model size, with an irreducible loss floor of c ~ 1.44.
In the data-limited regime, model capacity is less of a limiting factor than data availability.
Estimated entropy per masked gene position is approximately 2.30 bits.
Abstract
Neural scaling laws -- power-law relationships between loss, model size, and data -- have been extensively documented for language and vision transformers, yet their existence in single-cell genomics remains largely unexplored. We present the first systematic study of scaling behaviour for masked-reconstruction transformers trained on single-cell RNA sequencing (scRNA-seq) data. Using expression profiles from the CELLxGENE Census, we construct two experimental regimes: a data-rich regime (512 highly variable genes, 200,000 cells) and a data-limited regime (1,024 genes, 10,000 cells). Across seven model sizes spanning three orders of magnitude in parameter count (533 to 3.4 x 10^8 parameters), we fit the parametric scaling law to validation mean squared error (MSE). The data-rich regime exhibits clear power-law scaling with an irreducible loss floor of c ~ 1.44, while the data-limited…
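To make the fitting step concrete, here is a minimal sketch of how a saturating power law with an irreducible floor could be fit to validation MSE across model sizes. The functional form L(N) = a * N^(-alpha) + c is an assumption (a common parametric choice; the abstract does not spell out the exact form used), and the loss values are synthetic, generated from hypothetical a and alpha with c set to the reported floor of 1.44. Only the endpoints of the parameter range (533 and 3.4 x 10^8) come from the abstract; the five intermediate sizes are placeholders. The grid search over c with a log-space linear fit is an illustrative technique, not necessarily the paper's optimizer.

```python
import numpy as np


def power_law(n, a, alpha, c):
    """Assumed scaling form: loss = a * n^(-alpha) + c, with c the irreducible floor."""
    return a * n ** (-alpha) + c


def fit_power_law(n, loss, num_grid=2000):
    """Fit (a, alpha, c) by grid search over c plus a log-log linear fit.

    For a fixed c, log(loss - c) = log(a) - alpha * log(n) is linear in log(n),
    so a and alpha follow from ordinary least squares. The candidate c with the
    smallest squared error in the original loss units is kept.
    """
    n = np.asarray(n, dtype=float)
    loss = np.asarray(loss, dtype=float)
    log_n = np.log(n)
    best = None
    # c must stay strictly below the smallest observed loss so the log is defined.
    for c in np.linspace(0.0, loss.min(), num_grid, endpoint=False):
        slope, intercept = np.polyfit(log_n, np.log(loss - c), 1)
        a, alpha = np.exp(intercept), -slope
        sse = np.sum((power_law(n, a, alpha, c) - loss) ** 2)
        if best is None or sse < best[0]:
            best = (sse, a, alpha, c)
    return best[1], best[2], best[3]


# Seven hypothetical model sizes, log-spaced between the reported endpoints.
sizes = np.logspace(np.log10(533), np.log10(3.4e8), 7)

# Synthetic, noise-free losses: a and alpha are made up for illustration,
# c = 1.44 matches the floor reported for the data-rich regime.
true_a, true_alpha, true_c = 5.0, 0.3, 1.44
losses = power_law(sizes, true_a, true_alpha, true_c)

a_hat, alpha_hat, c_hat = fit_power_law(sizes, losses)
print(f"a = {a_hat:.3f}, alpha = {alpha_hat:.3f}, c = {c_hat:.3f}")
```

With real measurements, a negligible fitted alpha or a poor fit would be the signature of a regime in which loss does not scale with model size.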
Taxonomy
Topics: Single-cell and spatial transcriptomics · Language and cultural evolution · Neural dynamics and brain function
