BiSSLB: Binary Spike-and-Slab Lasso Biclustering
Sijian Fan, Ray Bai

TL;DR
BiSSLB introduces a Bayesian biclustering method for binary data that is noise-robust, scalable, and capable of discovering overlapping biclusters of various sizes without prior knowledge of data characteristics.
Contribution
The paper presents a novel Bayesian biclustering approach for binary data that combines spike-and-slab lasso priors with an Indian buffet process (IBP) prior and fits the model with a scalable coordinate ascent algorithm. It outperforms existing methods under noisy conditions.
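To make the "spike-and-slab" idea concrete, here is a minimal sketch of the generic spike-and-slab lasso density: a two-component mixture of Laplace distributions, with a sharply peaked "spike" that shrinks negligible entries toward zero and a diffuse "slab" that leaves large entries nearly unpenalized. The function names and the parameter values (theta, lam_spike, lam_slab) are illustrative assumptions; this summary does not state the exact parameterization BiSSLB uses.

```python
import math


def laplace_pdf(x, lam):
    """Density of a zero-centered Laplace distribution with rate lam."""
    return 0.5 * lam * math.exp(-lam * abs(x))


def ssl_pdf(x, theta, lam_spike, lam_slab):
    """Spike-and-slab lasso density: a mixture of two Laplace components.

    theta is the prior weight on the slab; lam_spike should be much larger
    than lam_slab so the spike concentrates tightly around zero.
    """
    return theta * laplace_pdf(x, lam_slab) + (1.0 - theta) * laplace_pdf(x, lam_spike)


def slab_probability(x, theta, lam_spike, lam_slab):
    """Conditional probability that a value x came from the slab component."""
    slab = theta * laplace_pdf(x, lam_slab)
    spike = (1.0 - theta) * laplace_pdf(x, lam_spike)
    return slab / (slab + spike)


# Hypothetical settings chosen only for illustration.
theta, lam_spike, lam_slab = 0.5, 20.0, 1.0
near_zero = slab_probability(0.0, theta, lam_spike, lam_slab)
far_from_zero = slab_probability(2.0, theta, lam_spike, lam_slab)
```

With these settings, a value at zero is attributed mostly to the spike, while a value well away from zero is attributed almost entirely to the slab. This adaptive behavior is what lets such a prior separate entries that belong to a bicluster from entries that do not.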
Findings
BiSSLB outperforms state-of-the-art methods on simulated data.
It is effective on real SNP and PPI datasets with high noise levels.
It automatically determines the number of biclusters from the data.
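For readers unfamiliar with the setting, the sketch below generates the kind of input a binary biclustering method has to handle: a 0/1 matrix containing overlapping blocks of ones of different sizes, corrupted by noise. The function name, the bicluster layout, and the independent bit-flip noise model are assumptions made for illustration; the paper's actual simulation design is not described in this summary.

```python
import random


def simulate_binary_biclusters(n_rows, n_cols, biclusters, flip_prob, seed=0):
    """Return (clean, noisy) 0/1 matrices as lists of lists.

    biclusters is a list of (row_indices, col_indices) pairs; every cell in
    a bicluster is set to 1, and biclusters are allowed to overlap.
    Each cell of the clean matrix is then flipped independently with
    probability flip_prob (an assumed noise model).
    """
    rng = random.Random(seed)
    clean = [[0] * n_cols for _ in range(n_rows)]
    for rows, cols in biclusters:
        for i in rows:
            for j in cols:
                clean[i][j] = 1
    noisy = [
        [1 - value if rng.random() < flip_prob else value for value in row]
        for row in clean
    ]
    return clean, noisy


# Three hypothetical biclusters: two that overlap and one small, separate one.
biclusters = [
    (range(0, 6), range(0, 5)),
    (range(4, 12), range(3, 10)),
    (range(14, 16), range(10, 12)),
]
clean, noisy = simulate_binary_biclusters(20, 12, biclusters, flip_prob=0.1)
```

A biclustering method sees only the noisy matrix and must recover the row and column sets, including the overlap in rows 4 to 5 and columns 3 to 4, without being told how many biclusters exist.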
Abstract
Biclustering is a powerful unsupervised learning technique for simultaneously identifying coherent subsets of rows and columns in a data matrix, thus revealing local patterns that may not be apparent in global analyses. However, most biclustering methods are developed for continuous data and are not applicable for binary datasets such as single-nucleotide polymorphism (SNP) or protein-protein interaction (PPI) data. Existing biclustering algorithms for binary data often struggle to recover biclustering patterns under noise, face scalability issues, and/or bias the final results towards biclusters of a particular size or characteristic. We propose a Bayesian method for biclustering binary datasets called Binary Spike-and-Slab Lasso Biclustering (BiSSLB). Our method is robust to noise and allows for overlapping biclusters of various sizes without prior knowledge of the noise level or…
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Gene expression and cancer classification
