Negative binomial count splitting for single-cell RNA sequencing data
Anna Neufeld, Joshua Popp, Lucy L. Gao, Alexis Battle, Daniela Witten

TL;DR
This paper introduces negative binomial count splitting for single-cell RNA sequencing data, enabling independent dataset creation for model validation in the presence of overdispersion.
Contribution
It extends Poisson count splitting to negative binomial models using Dirichlet-multinomial sampling, improving validation of scRNA-seq models with overdispersed data.
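As a rough illustration of the Dirichlet-multinomial idea, the sketch below splits a count matrix into several folds. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `nb_count_split`, the fold proportions `eps`, and the simulated data are hypothetical, and the overdispersion parameter `b` is treated as known. The Dirichlet-multinomial draw is carried out as a sequence of beta-binomial draws, which gives the same distribution.

```python
import numpy as np

def nb_count_split(X, b, eps, rng):
    """Split counts X into len(eps) folds via Dirichlet-multinomial sampling.

    X   : (n_cells, n_genes) array of non-negative integer counts
    b   : overdispersion (size) parameter, assumed known; scalar or per-gene
    eps : fold proportions, positive and summing to 1
    Returns an array of shape (len(eps), n_cells, n_genes) whose folds sum to X.
    """
    X = np.asarray(X)
    eps = np.asarray(eps, dtype=float)
    b = np.broadcast_to(np.asarray(b, dtype=float), X.shape)
    remaining = X.copy()
    folds = []
    # Stick-breaking: each fold takes a beta-binomial share of what is left.
    for m in range(len(eps) - 1):
        rest = eps[m + 1:].sum()
        p = rng.beta(eps[m] * b, rest * b)
        x_m = rng.binomial(remaining, p)
        folds.append(x_m)
        remaining = remaining - x_m
    folds.append(remaining)  # the last fold receives the leftover counts
    return np.stack(folds)

# Hypothetical example: simulated NB counts with mean mu and variance mu + mu^2 / b.
rng = np.random.default_rng(0)
n_cells, mu = 20_000, 5.0
b = np.array([1.0, 2.0, 5.0])  # one assumed overdispersion value per gene
X = rng.negative_binomial(b, b / (b + mu), size=(n_cells, 3))
eps = [0.5, 0.3, 0.2]
X_folds = nb_count_split(X, b, eps, rng)

# Empirical correlation between the first two folds of the first gene.
corr_01 = np.corrcoef(X_folds[0][:, 0], X_folds[1][:, 0])[0, 1]
print("correlation between folds 0 and 1:", round(corr_01, 3))
```

When `b` matches the true overdispersion, fold `m` follows a negative binomial distribution with mean `eps[m] * mu` and size `eps[m] * b`, and the folds are independent, so the printed correlation should be close to zero.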
Findings
Outperforms Poisson count splitting in simulations with overdispersed data
Validates clusters of kidney cells from a human fetal cell atlas
Provides a flexible method for independent dataset generation
Abstract
The analysis of single-cell RNA sequencing (scRNA-seq) data often involves fitting a latent variable model to learn a low-dimensional representation for the cells. Validating such a model poses a major challenge. If we could sequence the same set of cells twice, we could use one dataset to fit a latent variable model and the other to validate it. In reality, we cannot sequence the same set of cells twice. Poisson count splitting was recently proposed as a way to work backwards from a single observed Poisson data matrix to obtain independent Poisson training and test matrices that could have arisen from two independent sequencing experiments conducted on the same set of cells. However, the Poisson count splitting approach requires that the original data are exactly Poisson distributed: in the presence of any overdispersion, the resulting training and test datasets are not independent. In…
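The dependence problem described in the abstract can be seen in a small simulation. The sketch below is illustrative only: `poisson_count_split`, the split proportion 0.5, and the simulated means and overdispersion are assumptions chosen for the example. It applies binomial thinning, the operation underlying Poisson count splitting, to Poisson data and to negative binomial data, then compares the correlation between the resulting training and test counts.

```python
import numpy as np

def poisson_count_split(X, eps, rng):
    """Binomial thinning: each unit of count goes to the training set with probability eps."""
    X = np.asarray(X)
    X_train = rng.binomial(X, eps)
    return X_train, X - X_train

rng = np.random.default_rng(0)
n, mu, b = 50_000, 5.0, 2.0  # hypothetical sample size, mean, and overdispersion

# Exactly Poisson data: thinning yields independent training and test counts.
X_pois = rng.poisson(mu, size=n)
corr_pois = np.corrcoef(*poisson_count_split(X_pois, 0.5, rng))[0, 1]

# Overdispersed (negative binomial) data with the same mean: thinning leaves
# the training and test counts positively correlated.
X_nb = rng.negative_binomial(b, b / (b + mu), size=n)
corr_nb = np.corrcoef(*poisson_count_split(X_nb, 0.5, rng))[0, 1]

print("Poisson data, train/test correlation:", round(corr_pois, 3))
print("NB data, train/test correlation:     ", round(corr_nb, 3))
```

In the negative binomial case the two halves share the same latent gamma-distributed rate, which is why the correlation stays well above zero.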
Taxonomy
Topics: Single-cell and spatial transcriptomics · Machine Learning and Algorithms · Bayesian Methods and Mixture Models
