Identification of relevant subtypes via preweighted sparse clustering
Sheila Gaynor, Eric Bair

TL;DR
This paper introduces a modified sparse clustering method designed to identify biologically relevant subgroups associated with specific outcomes, especially in high-dimensional biomedical data, improving upon conventional clustering techniques.
Contribution
The paper proposes a preweighted sparse clustering approach that detects outcome-related subgroups even when the data set contains many high-variance features, a setting in which conventional clustering methods tend to find clusters driven by those features instead.
Findings
Correctly identified outcome-associated subgroups in simulations
Applied the method to temporomandibular disorder cohort data
Applied the method to a leukemia microarray data set
Abstract
Cluster analysis methods are used to identify homogeneous subgroups in a data set. In biomedical applications, one frequently applies cluster analysis in order to identify biologically interesting subgroups. In particular, one may wish to identify subgroups that are associated with a particular outcome of interest. Conventional clustering methods generally do not identify such subgroups, particularly when there are a large number of high-variance features in the data set. Conventional methods may identify clusters associated with these high-variance features when one wishes to obtain secondary clusters that are more interesting biologically or more strongly associated with a particular outcome of interest. A modification of sparse clustering can be used to identify such secondary clusters or clusters associated with an outcome of interest. This method correctly identifies such clusters…
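Illustrative sketch
The abstract's central idea, that high-variance features can mask a secondary cluster structure, can be shown on toy data. The Python sketch below is not the authors' algorithm. It uses plain 2-means with fixed feature weights in place of sparse clustering, and it assumes a simple preweighting rule: a feature gets weight 0 if its mean differs strongly across the primary clusters, and weight 1 otherwise. The names two_means, preweights, and agreement, the threshold of 3, and the simulated data are all hypothetical choices made for this illustration.

```python
import numpy as np

def two_means(X, weights, n_iter=50):
    """Lloyd's algorithm for k=2 using feature-weighted squared distances."""
    Xw = X * np.sqrt(weights)  # weighted distance equals plain distance on scaled data
    # Deterministic start: the first observation and the observation farthest from it.
    d0 = ((Xw - Xw[0]) ** 2).sum(axis=1)
    centers = np.stack([Xw[0], Xw[np.argmax(d0)]])
    for _ in range(n_iter):
        dist = ((Xw[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.stack([Xw[labels == k].mean(axis=0) for k in range(2)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def preweights(X, labels, threshold=3.0):
    """Assumed rule: weight 0 if a feature separates the two clusters, else 1."""
    a, b = X[labels == 0], X[labels == 1]
    pooled_sd = np.sqrt((a.var(axis=0) + b.var(axis=0)) / 2)
    gap = np.abs(a.mean(axis=0) - b.mean(axis=0)) / pooled_sd
    return (gap < threshold).astype(float)

def agreement(u, v):
    """Fraction of matching labels, allowing the two cluster names to be swapped."""
    m = np.mean(u == v)
    return max(m, 1 - m)

# Toy data: features 0-4 carry a large "primary" split, features 5-9 a smaller
# "secondary" split that stands in for the outcome-associated subgroups.
rng = np.random.default_rng(0)
n = 60
primary = np.repeat([0, 1], n // 2)
secondary = np.arange(n) % 2
X = rng.normal(scale=0.3, size=(n, 10))
X[:, :5] += np.where(primary[:, None] == 1, 3.0, -3.0)
X[:, 5:] += np.where(secondary[:, None] == 1, 1.0, -1.0)

first = two_means(X, np.ones(10))   # unweighted pass follows the high-variance features
w = preweights(X, first)            # zero out features that explain the primary clusters
second = two_means(X, w)            # reweighted pass can now see the secondary split

print("first pass vs primary grouping:   ", agreement(first, primary))
print("second pass vs secondary grouping:", agreement(second, secondary))
```

In this toy setting the first pass should track the primary grouping and the second pass the secondary grouping. A real analysis would use a sparse clustering procedure that also estimates the feature weights, rather than the fixed 0/1 weights assumed here.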
