Imputing missing values in single-cell RNA-sequencing data: a statistical and machine learning-based approach
A F M Shamsuzzaman, Sumanta Ray, Anirban Mukhopadhyay

TL;DR
This paper introduces a new method called scDDI to better detect and fill in missing gene expression data in single-cell RNA sequencing.
Contribution
The novel scDDI method combines a Poisson–negative binomial mixture model with decision tree regression for improved dropout detection and imputation.
Findings
scDDI outperforms existing methods in dropout detection and imputation on both simulated and real datasets.
Improved imputation leads to better performance in downstream tasks like clustering and subpopulation identification.
Abstract
Single-cell RNA sequencing (scRNA-seq) offers a powerful tool to capture gene expression patterns within individual cells. However, due to the limited RNA content within cells, dropout events occur, resulting in a substantial number of zero counts in the single-cell expression matrix. To address this issue, we propose a novel method called single-cell dropout detection and imputation (scDDI). This method identifies dropout events using a Poisson–negative binomial mixture model and subsequently imputes the missing values using a decision tree regression model. We evaluate the performance of scDDI on both simulated and real scRNA-seq datasets, demonstrating its superiority over established single-cell imputation techniques. Notably, scDDI significantly improves dropout detection, leading to enhanced performance in various downstream analysis tasks like gene expression recovery, cell…
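The dropout-detection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two-component mixture in which a low-rate Poisson component stands for dropout and a negative binomial component stands for genuine expression. The mixing weight `pi`, the Poisson rate `lam`, the negative binomial parameters `r` and `p`, and the 0.5 threshold are all hypothetical values chosen for illustration; a real pipeline would estimate the parameters from the data, for example with expectation-maximization.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of count k under Poisson(lam)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def negbin_logpmf(k, r, p):
    """Log-probability of count k under a negative binomial with size r and
    success probability p: P(k) = Gamma(k+r) / (k! Gamma(r)) * p^r * (1-p)^k."""
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log(1 - p))


def dropout_posterior(k, pi, lam, r, p):
    """Posterior probability that count k came from the (assumed) Poisson
    dropout component rather than the negative binomial expression component."""
    log_drop = math.log(pi) + poisson_logpmf(k, lam)
    log_expr = math.log(1 - pi) + negbin_logpmf(k, r, p)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_drop, log_expr)
    w_drop = math.exp(log_drop - m)
    w_expr = math.exp(log_expr - m)
    return w_drop / (w_drop + w_expr)


def flag_dropouts(counts, pi, lam, r, p, threshold=0.5):
    """Mark each count whose dropout posterior exceeds the threshold."""
    return [dropout_posterior(k, pi, lam, r, p) > threshold for k in counts]


# Hypothetical parameters for one gene (illustrative only).
params = dict(pi=0.3, lam=0.1, r=2.0, p=0.2)
flags = flag_dropouts([0, 0, 7, 12, 1], **params)
```

Entries flagged this way would then be treated as missing and handed to a regression model (a decision tree regressor in the paper) trained on the unflagged entries, while the remaining values are left unchanged.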
Taxonomy
Topics: Single-cell and spatial transcriptomics · Cell Image Analysis Techniques · Extracellular vesicles in disease
