Clustering-Based Outcome Models for Clinical Studies: A Scoping Review
Johannes Vilsmeier, Fabian Eibensteiner, Franz König, Francois Mercier, Robin Ristl, Nigel Stallard, Marc Vandemeulebroecke, Sarah Zohar, Martin Posch

TL;DR
This review systematically categorizes and discusses clustering-based outcome models in clinical research, highlighting their applications in high-dimensional data, risk stratification, and subgroup analysis for heterogeneous patient populations.
Contribution
It provides a comprehensive overview of informed and agnostic clustering models, summarizing 55 studies and their applications in biomedical and public health contexts.
Findings
Clustering models are useful for high-dimensional covariate data.
They support risk stratification and subgroup-specific treatment effects.
Applications include rare disease research and clinical trial analysis.
Abstract
This review provides a systematic overview of methods that combine covariate-based clustering of observational units (patients) with outcome models for clinical studies. We distinguish between informed-cluster models, where the outcome contributes to cluster formation, and agnostic-cluster models, where clustering is performed solely on covariates in a separate first step. Informed-cluster models include product partition models with covariates (PPMx), finite mixtures of regression models (FMR), and cluster-aware supervised learning (CluSL). Agnostic-cluster models encompass two-step procedures using either model-based or algorithmic clustering followed by cluster-specific regression models. Following a systematic search of Web of Science and PubMed, 55 records were identified that propose or evaluate such models. We describe the key models, summarise study characteristics, and present…
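As a rough illustration of the agnostic-cluster (two-step) idea described in the abstract, the sketch below clusters simulated patients on covariates only and then estimates a treatment effect separately within each cluster. It is not taken from the review: the simulated data, the choice of plain k-means for step one, the difference-in-means estimator for step two, and the helper names `kmeans` and `cluster_specific_effects` are all assumptions made for this example.

```python
import random
from statistics import mean


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm on covariates only; the outcome is never used."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(mean(c) for c in zip(*members)))
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def cluster_specific_effects(labels, treated, outcome):
    """Step two: per-cluster regression of outcome on a binary treatment.

    With a single binary regressor, the least-squares slope equals the
    difference in mean outcome between treated and control patients.
    """
    effects = {}
    for j in sorted(set(labels)):
        y1 = [y for lab, t, y in zip(labels, treated, outcome) if lab == j and t == 1]
        y0 = [y for lab, t, y in zip(labels, treated, outcome) if lab == j and t == 0]
        if y1 and y0:
            effects[j] = mean(y1) - mean(y0)
    return effects


# Simulated data (hypothetical): two latent subgroups that differ in their
# covariates and in their true treatment effect (0.0 versus 2.0).
rng = random.Random(42)
covariates, treated, outcome = [], [], []
for i in range(200):
    group = i % 2          # latent subgroup, unknown to the analyst
    t = (i // 2) % 2       # alternating treatment allocation
    center = 5.0 * group
    covariates.append((rng.gauss(center, 0.5), rng.gauss(center, 0.5)))
    outcome.append(1.0 + 2.0 * group * t + rng.gauss(0.0, 0.3))
    treated.append(t)

labels = kmeans(covariates, k=2)                              # step 1: covariates only
effects = cluster_specific_effects(labels, treated, outcome)  # step 2: per-cluster model
print(effects)
```

Because the clusters are formed without looking at the outcome, the estimated effects differ between clusters only to the extent that the covariate-based grouping happens to align with outcome heterogeneity. An informed-cluster model, by contrast, would let the outcome contribute to cluster formation.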
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Genetic Associations and Epidemiology
