TL;DR
This paper introduces a non-parametric Bayesian method for analyzing digital gene expression data. It copes with a low number of biological replicates, or none at all, by clustering genes and inferring the number of clusters directly from the data.
Contribution
A novel hierarchical Bayesian model with a clustering approach that adapts to low-replicate RNA-seq data, improving analysis accuracy over existing methods.
Findings
Successfully models gene expression with high fidelity
Effectively compensates for low biological replicates
Infers the number of gene clusters from data
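To make the last finding concrete, here is a minimal Python sketch of how a non-parametric prior can leave the number of clusters open. It uses a Chinese restaurant process, which is one common construction and is an assumption here: the summary and the truncated abstract do not say which prior the paper uses. The function name `crp_assignments`, the concentration parameter `alpha`, and the gene count of 200 are all hypothetical, and the sketch draws only from the prior; it does not fit any expression data.

```python
import random

def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    Illustrative only: an assumed prior, not the paper's confirmed model.
    """
    rng = random.Random(seed)
    labels = []   # labels[i] = cluster index of item i
    sizes = []    # sizes[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)      # open a new cluster
        else:
            sizes[k] += 1        # join an existing cluster
        labels.append(k)
    return labels, sizes

labels, sizes = crp_assignments(n_items=200, alpha=1.0)
print(len(sizes), "clusters for 200 genes")
```

The number of clusters is never passed in; it emerges from the draws, and larger values of `alpha` tend to produce more clusters. In a full model, the posterior over these assignments would also depend on the observed read counts.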
Abstract
Next-generation sequencing technologies provide a revolutionary tool for generating gene expression data. Starting with a fixed RNA sample, they construct a library of millions of differentially abundant short sequence tags or "reads", which constitute a fundamentally discrete measure of the level of gene expression. A common limitation in experiments using these technologies is the low number or even absence of biological replicates, which complicates the statistical analysis of digital gene expression data. Analysis of this type of data has often been based on modified tests originally devised for analysing microarrays; both these and even de novo methods for the analysis of RNA-seq data are plagued by the common problem of low replication. We propose a novel, non-parametric Bayesian approach for the analysis of digital gene expression data. We begin with a hierarchical model for…
