Biclustering Via Sparse Clustering
Qian Liu, Guanhua Chen, Michael R. Kosorok, and Eric Bair

TL;DR
This paper introduces a flexible biclustering framework based on sparse clustering, capable of identifying subgroups with differences in means or variances across features, demonstrated on simulated and real data.
Contribution
The paper extends the sparse clustering method of Witten and Tibshirani (2010) to identify biclusters whose features differ in mean, in variance, or in more general ways, and reports improved accuracy and computational efficiency.
Findings
Outperforms existing methods in accuracy
Faster computation times
Effective on both simulated and real datasets
Abstract
In many situations it is desirable to identify clusters that differ with respect to only a subset of features. Such clusters may represent homogeneous subgroups of patients with a disease, such as cancer or chronic pain. We define a bicluster to be a submatrix U of a larger data matrix X such that the features and observations in U differ from those not contained in U. For example, the observations in U could have different means or variances with respect to the features in U. We propose a general framework for biclustering based on the sparse clustering method of Witten and Tibshirani (2010). We develop a method for identifying features that belong to biclusters. This framework can be used to identify biclusters that differ with respect to the means of the features, the variance of the features, or more general differences. We apply these methods to several simulated and real-world…
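To make the idea concrete, here is a minimal Python sketch of sparse 2-means clustering in the style of Witten and Tibshirani (2010), used to recover a single mean-shift bicluster from a toy matrix. It is an illustration under stated assumptions, not the authors' algorithm. The function names, the tuning value `s`, the simulated data, and the rule "take the smaller cluster as the bicluster rows, and the features with nonzero weight as its columns" are all hypothetical choices made for this example. The sketch covers differences in means only, not differences in variances.

```python
import numpy as np

def two_means(Z, rng, n_init=10, n_iter=50):
    """Plain Lloyd's 2-means with random restarts; returns the best labels."""
    best_labels, best_wcss = None, np.inf
    for _ in range(n_init):
        centers = Z[rng.choice(len(Z), size=2, replace=False)]
        for _ in range(n_iter):
            d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            # Keep the old center if a cluster happens to become empty.
            new_centers = np.array([
                Z[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
                for k in range(2)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels, wcss = d.argmin(axis=1), d.min(axis=1).sum()
        if wcss < best_wcss:
            best_labels, best_wcss = labels, wcss
    return best_labels

def bcss_per_feature(X, labels):
    """Between-cluster sum of squares of each feature (total minus within)."""
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for k in np.unique(labels):
        Xk = X[labels == k]
        within += ((Xk - Xk.mean(axis=0)) ** 2).sum(axis=0)
    return total - within

def update_weights(bcss, s, n_bisect=50):
    """Maximize sum_j w_j * bcss_j with ||w||_2 <= 1, ||w||_1 <= s, w >= 0.

    The solution is a soft-thresholded, L2-normalized copy of bcss; the
    threshold is found by bisection. Assumes s >= 1.
    """
    a = np.maximum(bcss, 0.0)
    w = a / np.linalg.norm(a)
    if w.sum() <= s:
        return w  # L1 constraint inactive, no thresholding needed
    lo, hi = 0.0, a.max()
    for _ in range(n_bisect):
        delta = 0.5 * (lo + hi)
        w = np.maximum(a - delta, 0.0)
        w = w / np.linalg.norm(w)
        if w.sum() > s:
            lo = delta
        else:
            hi = delta
    w = np.maximum(a - hi, 0.0)
    return w / np.linalg.norm(w)

def sparse_two_means(X, s, n_outer=10, seed=0):
    """Alternate between clustering on weighted features and reweighting."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    w = np.full(p, 1.0 / np.sqrt(p))
    for _ in range(n_outer):
        labels = two_means(X * np.sqrt(w), rng)
        w = update_weights(bcss_per_feature(X, labels), s)
    return labels, w

# Toy data (hypothetical): 100 observations, 50 features, with a mean shift
# of 3 planted in the first 30 observations and the first 10 features.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(100, 50))
X[:30, :10] += 3.0

labels, w = sparse_two_means(X, s=3.0)
small = np.argmin(np.bincount(labels, minlength=2))
rows = np.flatnonzero(labels == small)  # candidate bicluster observations
cols = np.flatnonzero(w > 0)            # candidate bicluster features
print("rows:", rows)
print("cols:", cols)
```

The L1 bound `s` controls how many features receive nonzero weight: smaller values give sparser feature sets, so in practice it would need to be tuned rather than fixed at 3.0 as it is here.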
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Advanced Clustering Algorithms Research
