Profile Likelihood Biclustering

Cheryl J. Flynn; Patrick O. Perry

arXiv:1206.6927 · stat.ME · June 4, 2020

TL;DR

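This paper introduces a biclustering method based on profile likelihood. The method provably recovers the true row and column classes as the dimensions of the data matrix grow, applies to binary, count, and continuous data, and is computed with a heuristic optimization procedure rather than an exhaustive combinatorial search.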

Contribution

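The paper proposes a profile likelihood biclustering procedure whose consistency guarantee holds even when the functional form of the data distribution is misspecified, together with a heuristic optimization procedure that avoids the expensive combinatorial search.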

Findings

01 Recovers the true row and column classes as the dimensions of the data matrix tend to infinity

02 Applies to binary, count, and continuous data

03 Performs well in real-world applications

Abstract

Biclustering, the process of simultaneously clustering the rows and columns of a data matrix, is a popular and effective tool for finding structure in a high-dimensional dataset. Many biclustering procedures appear to work well in practice, but most do not have associated consistency guarantees. To address this shortcoming, we propose a new biclustering procedure based on profile likelihood. The procedure applies to a broad range of data modalities, including binary, count, and continuous observations. We prove that the procedure recovers the true row and column classes when the dimensions of the data matrix tend to infinity, even if the functional form of the data distribution is misspecified. The procedure requires computing a combinatorial search, which can be expensive in practice. Rather than performing this search directly, we propose a new heuristic optimization procedure based…
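To make the idea of scoring a biclustering by a profile likelihood concrete, the sketch below evaluates a criterion for one candidate pair of row and column labelings: the sum, over blocks, of the block size times a function of the block mean. This is an illustrative assumption rather than the authors' code. The function name `profile_criterion`, the default f(m) = m^2/2 (the form obtained by profiling out block means in a Gaussian model with unit variance, up to a constant), and the toy matrix are all hypothetical. The heuristic search mentioned in the abstract is not shown.

```python
import numpy as np


def profile_criterion(X, row_labels, col_labels, K, L, f=lambda m: 0.5 * m ** 2):
    """Score a candidate biclustering (hypothetical helper, not from the paper).

    Returns sum over blocks (k, l) of block_size * f(block_mean).
    The default f corresponds to a unit-variance Gaussian working model.
    """
    X = np.asarray(X, dtype=float)
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)

    # One-hot membership matrices: R is (rows, K), C is (columns, L).
    R = np.eye(K)[row_labels]
    C = np.eye(L)[col_labels]

    # Sum and number of entries in each of the K x L blocks.
    block_sums = R.T @ X @ C
    block_sizes = np.outer(R.sum(axis=0), C.sum(axis=0))

    value = 0.0
    for k in range(K):
        for l in range(L):
            n_kl = block_sizes[k, l]
            if n_kl > 0:  # empty blocks contribute nothing
                value += n_kl * f(block_sums[k, l] / n_kl)
    return value


# Toy 4 x 4 matrix with an obvious 2 x 2 block structure.
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])

good = profile_criterion(X, [0, 0, 1, 1], [0, 0, 1, 1], K=2, L=2)
bad = profile_criterion(X, [0, 1, 0, 1], [0, 0, 1, 1], K=2, L=2)
print(good, bad)
```

On this toy matrix the labeling that matches the block structure scores higher than the one that mixes the row groups, which is the behavior a search over labelings would exploit. Swapping `f` for a different function of the block mean would be the natural way to adapt the sketch to binary or count data, again as an assumption about the working model.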

Code & Models

Repositories

patperry/biclustpl (Official)
