TL;DR
This paper introduces a biclustering method based on profile likelihood. The method is proven consistent as the dimensions of the data matrix grow, applies to binary, count, and continuous data, and uses a heuristic optimization procedure in place of an expensive combinatorial search.
Contribution
The paper proposes a novel profile likelihood biclustering procedure with theoretical consistency guarantees and a scalable heuristic optimization method.
Findings
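Recovers the true row and column classes as the dimensions of the data matrix tend to infinity, even when the functional form of the data distribution is misspecified
Applies to binary, count, and continuous data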
Performs well in real-world applications
Abstract
Biclustering, the process of simultaneously clustering the rows and columns of a data matrix, is a popular and effective tool for finding structure in a high-dimensional dataset. Many biclustering procedures appear to work well in practice, but most do not have associated consistency guarantees. To address this shortcoming, we propose a new biclustering procedure based on profile likelihood. The procedure applies to a broad range of data modalities, including binary, count, and continuous observations. We prove that the procedure recovers the true row and column classes when the dimensions of the data matrix tend to infinity, even if the functional form of the data distribution is misspecified. The procedure requires computing a combinatorial search, which can be expensive in practice. Rather than performing this search directly, we propose a new heuristic optimization procedure based…
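To make the idea of searching over row and column labels concrete, here is a minimal sketch, not the paper's implementation. It assumes a block-mean criterion of the form "sum over blocks of (block size) times f(block mean)" with the Gaussian-style choice f(mu) = mu^2 / 2, and it swaps in a plain greedy single-label reassignment search, since the abstract is truncated before it describes the authors' heuristic. The names profile_score and greedy_bicluster are hypothetical.

```python
import numpy as np


def profile_score(X, rows, cols, K, L, f=lambda mu: 0.5 * mu ** 2):
    """Assumed criterion: sum over blocks of block_size * f(block_mean)."""
    X = np.asarray(X, dtype=float)
    R = np.eye(K)[rows]                      # (n, K) one-hot row memberships
    C = np.eye(L)[cols]                      # (m, L) one-hot column memberships
    sums = R.T @ X @ C                       # (K, L) block sums
    sizes = np.outer(R.sum(axis=0), C.sum(axis=0))  # (K, L) block sizes
    # Empty blocks get mean 0 and therefore contribute nothing.
    means = np.divide(sums, sizes, out=np.zeros_like(sums), where=sizes > 0)
    return float((sizes * f(means)).sum())


def greedy_bicluster(X, K, L, n_sweeps=20, seed=0):
    """Greedy local search (an illustrative stand-in, not the paper's heuristic).

    Starting from random labels, move one row or column at a time to the
    class that most increases the score; stop when a full sweep changes nothing.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    rows = rng.integers(K, size=n)
    cols = rng.integers(L, size=m)
    best = profile_score(X, rows, cols, K, L)
    for _ in range(n_sweeps):
        improved = False
        for labels, n_classes in ((rows, K), (cols, L)):
            for i in range(labels.size):
                keep = labels[i]             # best label found so far for item i
                for new in range(n_classes):
                    if new == keep:
                        continue
                    labels[i] = new          # tentative move, scored in place
                    score = profile_score(X, rows, cols, K, L)
                    if score > best + 1e-12:
                        best, keep, improved = score, new, True
                labels[i] = keep             # commit the best label for item i
        if not improved:
            break
    return rows, cols, best
```

Calling greedy_bicluster(X, K=2, L=2) on a numeric matrix X returns row labels, column labels, and the final score. The search only ever accepts score-increasing moves, so it can stall in a local optimum; in practice one would run it from several seeds and keep the best result. Recomputing the full score for every tentative move is also wasteful and is done here only for clarity.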
