TL;DR
This paper introduces LAS, a statistically motivated biclustering method that identifies large average submatrices in high-dimensional data, demonstrating its effectiveness in gene expression analysis and disease subtype classification.
Contribution
LAS is an iterative-residual biclustering procedure that uses a Bonferroni-based significance score to trade off submatrix size against average value when searching a real-valued data matrix for large average submatrices.
Findings
LAS effectively identifies biologically relevant biclusters.
LAS compares favorably with a number of existing biclustering methods in a three-part validation study on two gene expression datasets.
LAS is sensitive to noise but useful for exploratory analysis.
Abstract
The search for sample-variable associations is an important problem in the exploratory analysis of high dimensional data. Biclustering methods search for sample-variable associations in the form of distinguished submatrices of the data matrix. (The rows and columns of a submatrix need not be contiguous.) In this paper we propose and evaluate a statistically motivated biclustering procedure (LAS) that finds large average submatrices within a given real-valued data matrix. The procedure operates in an iterative-residual fashion, and is driven by a Bonferroni-based significance score that effectively trades off between submatrix size and average value. We examine the performance and potential utility of LAS, and compare it with a number of existing methods, through an extensive three-part validation study using two gene expression datasets. The validation study examines quantitative…
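The two ideas named in the abstract, a Bonferroni-based score and an iterative-residual search, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: entries are treated as standard normal under the null, the score is the negative log of a Bonferroni-corrected Gaussian tail probability for a k x l submatrix, the submatrix size is held fixed, and the function names (`score`, `find_bicluster`, `subtract_bicluster`) and the `init_cols` parameter are hypothetical.

```python
import math

def log_binom(n, k):
    # log of the binomial coefficient C(n, k), via log-gamma for stability
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_normal_tail(x):
    # log P(Z > x) for a standard normal Z
    tail = 0.5 * math.erfc(x / math.sqrt(2))
    if tail > 0:
        return math.log(tail)
    # erfc underflowed (very large x): use the asymptotic phi(x) / x
    return -0.5 * x * x - math.log(x) - 0.5 * math.log(2 * math.pi)

def score(m, n, k, l, avg):
    # Assumed score: -log of [number of k x l submatrices of an m x n matrix
    # times the null probability that a fixed one has average >= avg].
    # Under the N(0, 1) null the average of k*l entries is N(0, 1/(k*l)).
    log_count = log_binom(m, k) + log_binom(n, l)
    return -(log_count + log_normal_tail(avg * math.sqrt(k * l)))

def best_rows(X, cols, k):
    # the k rows with the largest sums over the current columns
    sums = [sum(row[j] for j in cols) for row in X]
    return sorted(sorted(range(len(X)), key=lambda i: -sums[i])[:k])

def best_cols(X, rows, l):
    # the l columns with the largest sums over the current rows
    n = len(X[0])
    sums = [sum(X[i][j] for i in rows) for j in range(n)]
    return sorted(sorted(range(n), key=lambda j: -sums[j])[:l])

def find_bicluster(X, k, l, init_cols, max_iter=100):
    # Alternate row and column updates until the selection stops changing.
    # Simplification: k and l are fixed and a single start is used.
    m, n = len(X), len(X[0])
    rows, cols = [], sorted(init_cols)
    for _ in range(max_iter):
        new_rows = best_rows(X, cols, k)
        new_cols = best_cols(X, new_rows, l)
        if new_rows == rows and new_cols == cols:
            break
        rows, cols = new_rows, new_cols
    avg = sum(X[i][j] for i in rows for j in cols) / (k * l)
    return rows, cols, avg, score(m, n, k, l, avg)

def subtract_bicluster(X, rows, cols, avg):
    # Residual step: remove the found average in place so that a
    # subsequent search can uncover a different submatrix.
    for i in rows:
        for j in cols:
            X[i][j] -= avg

# Toy example: a 6 x 5 matrix of zeros with a planted 2 x 2 block of 3.0
X = [[0.0] * 5 for _ in range(6)]
for i in (1, 2):
    for j in (3, 4):
        X[i][j] = 3.0

rows, cols, avg, s = find_bicluster(X, k=2, l=2, init_cols=[0, 3])
print(rows, cols, avg, round(s, 2))
subtract_bicluster(X, rows, cols, avg)
```

The score rewards both a high average and a large submatrix: the tail term shrinks as `avg * sqrt(k * l)` grows, while the binomial terms penalize the number of submatrices of that size that could have been examined. A fuller search would presumably also vary k and l and use several starting column sets, which this sketch omits.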
