Bayesian clustering of replicated time-course gene expression data with weak signals
Audrey Qiuyan Fu, Steven Russell, Sarah J. Bray, Simon Tavaré

TL;DR
This paper introduces a Bayesian clustering method for time-course gene expression data with replicates and weak signals, using a Dirichlet-process mixture model and novel MCMC sampling to identify dynamic gene expression patterns.
Contribution
It develops a probabilistic clustering approach with a new MCMC algorithm and a two-step inference procedure, enabling effective analysis of noisy, replicated time-course data.
Findings
Identified 14 gene clusters in the Drosophila data, revealing distinct transcriptional responses.
Demonstrated the method's effectiveness on simulated data.
Provided an R package 'DIRECT' for implementation.
Abstract
To identify novel dynamic patterns of gene expression, we develop a statistical method to cluster noisy measurements of gene expression collected from multiple replicates at multiple time points, with an unknown number of clusters. We propose a random-effects mixture model coupled with a Dirichlet-process prior for clustering. The mixture model formulation allows for probabilistic cluster assignments. The random-effects formulation allows for attributing the total variability in the data to the sources that are consistent with the experimental design, particularly when the noise level is high and the temporal dependence is not strong. The Dirichlet-process prior induces a prior distribution on partitions and helps to estimate the number of clusters (or mixture components) from the data. We further tackle two challenges associated with Dirichlet-process prior-based methods. One is…
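The abstract's remark that a Dirichlet-process prior induces a prior distribution on partitions can be illustrated with the Chinese restaurant process, the standard sequential description of that partition prior. The sketch below is not taken from the paper or from the DIRECT package; the function names `sample_crp_partition` and `prior_cluster_count_distribution` and the concentration parameter name `alpha` are assumptions made for illustration only.

```python
import random
from collections import Counter


def sample_crp_partition(n_items, alpha, rng):
    """Draw one partition of n_items from a Chinese restaurant process.

    alpha (> 0) is the concentration parameter: larger values favour
    more clusters a priori.
    """
    labels = []  # labels[i] is the cluster index assigned to item i
    sizes = []   # sizes[k] is the current number of items in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
        labels.append(k)
    return labels


def prior_cluster_count_distribution(n_items, alpha, n_draws=2000, seed=0):
    """Monte Carlo estimate of the prior distribution of the number of clusters."""
    rng = random.Random(seed)
    counts = Counter(
        len(set(sample_crp_partition(n_items, alpha, rng)))
        for _ in range(n_draws)
    )
    return {k: c / n_draws for k, c in sorted(counts.items())}


if __name__ == "__main__":
    # Compare the implied prior on the number of clusters for two alphas.
    for alpha in (0.5, 5.0):
        dist = prior_cluster_count_distribution(100, alpha)
        mean_k = sum(k * p for k, p in dist.items())
        print(f"alpha={alpha}: prior mean number of clusters ~ {mean_k:.2f}")
```

The number of clusters is never fixed in advance here: it emerges from the sequential assignments, which is why such a prior can help estimate the number of mixture components from the data. In a full analysis this prior would be combined with a likelihood for the expression measurements, which this sketch leaves out.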
