TL;DR
This paper introduces a multivariate Hurdle model for inferring gene co-regulatory networks from zero-inflated single-cell gene expression data, outperforming existing methods in sensitivity and revealing new network structures.
Contribution
The study proposes a novel multivariate Hurdle model combined with neighborhood selection for graphical modeling of zero-inflated single-cell data, improving network inference accuracy.
Findings
More sensitive network inference than existing methods in simulations, even when the data depart from the Hurdle model.
Successfully applied to T follicular helper cell and dendritic cell data.
Reveals network structures not detectable in bulk data.
Abstract
Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant on a population level, giving rise to zero-inflated expression patterns. To infer gene co-regulatory networks from such data, we propose a multivariate Hurdle model. It comprises a mixture of singular Gaussian distributions. We employ neighborhood selection with the pseudo-likelihood and a group lasso penalty to select and fit undirected graphical models that capture conditional independences between genes. The proposed method is more sensitive than existing approaches in simulations, even under departures from our Hurdle model. The…
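The abstract's recipe of neighborhood selection with a group lasso penalty can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the Hurdle pseudo-likelihood for a plain least-squares loss, and the function names, the two-feature grouping (a detection indicator plus the expression value for each neighbor gene), and the proximal-gradient solver are all assumptions made for illustration.

```python
import numpy as np


def group_soft_threshold(w, t):
    """Proximal operator of t * ||w||_2: shrink a whole coefficient group."""
    norm = np.linalg.norm(w)
    if norm <= t:
        return np.zeros_like(w)
    return (1.0 - t / norm) * w


def neighborhood_group_lasso(Y, target, lam, n_iter=500):
    """Select neighbors of gene `target` (hypothetical helper, simplified loss).

    Y is an (n_cells, n_genes) expression matrix with exact zeros for
    undetected transcripts. Every other gene j contributes a group of two
    features: the indicator 1[Y_j != 0] and the value Y_j.
    """
    n, p = Y.shape
    others = [j for j in range(p) if j != target]
    V = (Y != 0).astype(float)
    X = np.column_stack([c for j in others for c in (V[:, j], Y[:, j])])
    # Standardize features and center the response (assumption: no intercept).
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    y = Y[:, target] - Y[:, target].mean()

    # Step size 1/L for the least-squares loss (1 / 2n) * ||y - X b||^2.
    step = n / max(np.linalg.norm(X, 2) ** 2, 1e-12)
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Gradient step, then group-wise shrinkage (proximal gradient).
        z = b - step * (X.T @ (X @ b - y)) / n
        for g in range(len(others)):
            block = slice(2 * g, 2 * g + 2)
            b[block] = group_soft_threshold(z[block], step * lam)
    # One coefficient pair per candidate neighbor; an all-zero pair means no edge.
    return {j: b[2 * g:2 * g + 2].copy() for g, j in enumerate(others)}
```

In this sketch, gene j is treated as a neighbor of the target when its coefficient pair is not entirely zero. Running the function once per gene and combining the resulting neighborhoods (for example, keeping an edge only if both endpoints select each other) would give an undirected graph. The group penalty is what makes the indicator and value coefficients for a gene enter or leave the model together.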
