ZINBGT: Exploratory Data Analysis of Single-Cell Transcriptomic Expression Using Mixture Models
Toby Kettlewell, Yiyi Cheng, Thomas D. Otto, Vincent Macaulay, Mayetri Gupta

TL;DR
This paper introduces ZINBGT, a mixture-model-based visualization method for single-cell transcriptomic data that provides interpretable insights and diagnostic summaries, addressing limitations of traditional visualization techniques.
Contribution
The paper presents ZINBGT, a novel mixture model for visualizing gene expression in single-cell data, with diagnostic tools to assess data quality and model fit.
Findings
ZINBGT reveals outlier genes in T. brucei samples.
Application to human immune cells highlights relationships between sparsity, mean, and spread.
Discrepancies between simulated and real datasets call the validity of the simulations into question.
Abstract
Single-cell transcriptomic data approximates the abundance of proteins at a high resolution, but its noisiness necessitates transformation by a pipeline of methods before analysis and inference. In the absence of robust validation of these pipelines and methods, it remains unclear how best to process any particular dataset. To compensate for this, popular visualisation methods, e.g., t-SNE and UMAP, are commonly used to produce descriptions of datasets. Such visualisations are incomplete and provide subjective descriptions of samples rather than statistically meaningful statements about technical noise or biology. In this paper, we introduce the Zero-Inflated Negative-Binomial with Geometric Tail (ZINBGT), a mixture-model-based strategy for producing interpretable visualisations of each gene's expression across cells, along with diagnostic summaries that use Wasserstein distance to…
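To make the idea of a Wasserstein-based diagnostic concrete, here is a minimal sketch, not the authors' implementation. It assumes a plain zero-inflated negative binomial (ZINB) for one gene's counts and omits the geometric-tail component, whose exact form the text above does not specify. The function names `zinb_pmf` and `wasserstein_to_zinb`, the parameter names, and the simulated counts are all hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.stats import nbinom, wasserstein_distance


def zinb_pmf(k, pi_zero, r, p):
    """PMF of a zero-inflated negative binomial (assumed form, no geometric tail).

    pi_zero: extra probability mass placed at zero.
    r, p:    negative binomial size and success probability (scipy convention).
    """
    k = np.asarray(k)
    base = nbinom.pmf(k, r, p)
    return np.where(k == 0, pi_zero + (1 - pi_zero) * base, (1 - pi_zero) * base)


def wasserstein_to_zinb(counts, pi_zero, r, p, tail=1e-9):
    """1-D Wasserstein distance between observed counts and a ZINB model.

    The model support is truncated where the negative binomial tail mass
    falls below `tail`; this truncation is a choice made for this sketch.
    """
    counts = np.asarray(counts)
    upper = int(max(counts.max(), nbinom.ppf(1 - tail, r, p))) + 1
    support = np.arange(upper + 1)
    weights = zinb_pmf(support, pi_zero, r, p)
    weights = weights / weights.sum()  # renormalise after truncation
    return wasserstein_distance(counts, support, v_weights=weights)


# Simulated counts for a single hypothetical gene across 5000 cells.
rng = np.random.default_rng(0)
n_cells, pi_true, r_true, p_true = 5000, 0.3, 2.0, 0.2
is_dropout = rng.random(n_cells) < pi_true
counts = np.where(is_dropout, 0, rng.negative_binomial(r_true, p_true, n_cells))

d_good = wasserstein_to_zinb(counts, pi_true, r_true, p_true)  # well-specified model
d_bad = wasserstein_to_zinb(counts, 0.9, r_true, p_true)       # too much zero inflation
print(f"well-specified: {d_good:.3f}, misspecified: {d_bad:.3f}")
```

A small distance suggests the fitted mixture describes the gene's observed counts well, while a large one flags a gene whose expression the model does not capture. How ZINBGT itself turns such distances into diagnostic summaries is not described in the truncated abstract.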
