Supervised clustering of high dimensional data using regularized mixture modeling
Wennan Chang, Changlin Wan, Yong Zang, Chi Zhang, Sha Cao

TL;DR
This paper introduces CSMR, a supervised clustering algorithm using penalized mixture regression, which improves the analysis of high-dimensional molecular data related to clinical phenotypes, aiding personalized medicine.
Contribution
The paper presents a novel supervised clustering method, CSMR, that enhances biological interpretability and computational efficiency in analyzing heterogeneous high-dimensional data.
Findings
CSMR accurately identifies explanatory feature subspaces.
CSMR outperforms baseline methods in simulated datasets.
CSMR effectively recapitulates subgroups in drug sensitivity data.
Abstract
Identifying relationships between molecular variations and their clinical presentations has been challenged by the heterogeneous causes of a disease. It is imperative to unveil the relationship between the high dimensional molecular manifestations and the clinical presentations, while taking into account the possible heterogeneity of the study subjects. We proposed a novel supervised clustering algorithm using a penalized mixture regression model, called CSMR, to deal with the challenges in studying the heterogeneous relationships between high dimensional molecular features and a phenotype. The algorithm was adapted from the classification expectation maximization algorithm and offers a novel supervised solution to the clustering problem, with substantial improvement in both computational efficiency and biological interpretability. Experimental evaluation on simulated benchmark…
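The general idea named in the abstract, a classification expectation maximization (CEM) loop wrapped around penalized regressions, can be illustrated with a small sketch. This is not the authors' CSMR implementation: it is a generic CEM loop over per-cluster lasso fits, written with NumPy only. The function names `lasso_cd` and `cem_mixture_lasso`, the parameters `k`, `lam`, and `n_iter`, the random initialization, and the Gaussian error model are all assumptions made for illustration.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=100):
    """Lasso via coordinate descent (hypothetical helper, no intercept).

    Minimizes (1 / 2n) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.asarray(y, dtype=float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            resid += X[:, j] * beta[j]      # remove feature j's current contribution
            rho = X[:, j] @ resid / n
            # Soft-thresholding update for coordinate j
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]      # add the updated contribution back
    return beta

def cem_mixture_lasso(X, y, k=2, lam=0.1, n_iter=20, seed=0):
    """Generic CEM loop over k sparse regressions (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    labels = rng.integers(k, size=n)        # random hard initialization
    betas = np.zeros((k, p))
    sigmas = np.ones(k)
    props = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # M-step: refit each cluster's sparse regression on its assigned samples
        for c in range(k):
            idx = labels == c
            if idx.sum() < 2:
                continue                    # keep old parameters for a near-empty cluster
            betas[c] = lasso_cd(X[idx], y[idx], lam)
            r = y[idx] - X[idx] @ betas[c]
            sigmas[c] = max(r.std(), 1e-3)
            props[c] = idx.mean()
        # E-step and C-step: Gaussian log-density per cluster, then hard assignment
        resid = y[:, None] - X @ betas.T    # shape (n, k)
        logp = np.log(props) - np.log(sigmas) - 0.5 * (resid / sigmas) ** 2
        new_labels = logp.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break                           # assignments stable: converged
        labels = new_labels
    return labels, betas
```

Calling `cem_mixture_lasso(X, y, k=2)` on an `(n, p)` feature matrix and a length-`n` response returns one hard cluster label per sample and one sparse coefficient vector per cluster. The nonzero entries of each vector play the role of a cluster-specific feature subspace. Like any EM-type procedure, the result depends on the initialization, so several seeds would normally be tried.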
