A Dirichlet Process Mixture Model for Clustering Longitudinal Gene Expression Data
Jiehuan Sun, Jose D. Herazo-Maya, Naftali Kaminski, Hongyu Zhao, Joshua L. Warren

TL;DR
This paper introduces BClustLonG, a Bayesian clustering method utilizing longitudinal gene expression data to improve subgroup identification in biomedical research, accounting for intra-individual variability and gene correlations.
Contribution
It develops a novel Bayesian model that combines a linear mixed-effects model and factor analysis with Dirichlet process priors to cluster longitudinal gene expression data.
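To make the ingredients above concrete, here is a minimal sketch of how a Dirichlet process prior can generate clusters of linear trajectories. It is not the authors' implementation. It assumes a truncated stick-breaking representation, a single gene, and Gaussian draws for the cluster-level intercepts and slopes, and it omits the factor-analysis component and all posterior inference. The function names and every numeric setting (`alpha`, `n_components`, the standard deviations) are hypothetical choices made for illustration.

```python
import numpy as np


def stick_breaking_weights(alpha, n_components, rng):
    """Truncated stick-breaking weights for a Dirichlet process prior."""
    betas = rng.beta(1.0, alpha, size=n_components)
    betas[-1] = 1.0  # close the stick so the truncated weights sum to 1
    # Length of stick left before each break: 1, (1-b1), (1-b1)(1-b2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining


def simulate_trajectories(n_subjects, times, alpha=1.0, n_components=20, seed=0):
    """Simulate one gene's expression over time for subjects in DP-drawn clusters."""
    rng = np.random.default_rng(seed)
    weights = stick_breaking_weights(alpha, n_components, rng)
    # Assumed base distribution: cluster-level (intercept, slope) pairs
    cluster_coefs = rng.normal(0.0, 2.0, size=(n_components, 2))
    labels = rng.choice(n_components, size=n_subjects, p=weights)
    # Design matrix with an intercept column and a time column, shape (T, 2)
    design = np.column_stack((np.ones_like(times), times))
    # Mixed-effects flavor: subject coefficients = cluster mean + individual deviation
    deviations = rng.normal(0.0, 0.3, size=(n_subjects, 2))
    subject_coefs = cluster_coefs[labels] + deviations
    noise = rng.normal(0.0, 0.5, size=(n_subjects, len(times)))
    expression = subject_coefs @ design.T + noise  # shape (n_subjects, T)
    return labels, expression, weights


if __name__ == "__main__":
    times = np.array([0.0, 1.0, 2.0, 3.0])
    labels, expression, weights = simulate_trajectories(10, times)
    print("occupied clusters:", np.unique(labels))
    print("expression matrix shape:", expression.shape)
```

Smaller values of `alpha` concentrate the weights on fewer clusters, which is why the number of subgroups does not have to be fixed in advance under this kind of prior.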
Findings
BClustLonG outperforms existing clustering methods in simulations.
BClustLonG successfully distinguishes burn patients from trauma patients.
BClustLonG identifies meaningful subgroups among trauma patients.
Abstract
Subgroup identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly utilized to define subgroups. Longitudinal gene expression profiles might provide more information on disease progression than is captured by baseline profiles alone. Moreover, longitudinal gene expression data allow intra-individual variability to be accounted for when grouping patients. Therefore, subgroup identification could be more accurate and effective with the aid of longitudinal gene expression data. However, existing statistical methods are unable to fully utilize these data for patient clustering. In this article, we introduce a novel subgroup identification method in the Bayesian setting based on longitudinal gene expression profiles. This method, called BClustLonG, adopts a linear mixed-effects framework to model the trajectory of…
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Gene expression and cancer classification
