HBIC: A Biclustering Algorithm for Heterogeneous Datasets
Adán José-García, Julie Jacques, Clément Chauvet, Vincent Sobanski, Clarisse Dhaenens

TL;DR
HBIC is a novel biclustering algorithm designed for heterogeneous datasets with mixed data types, enabling the discovery of meaningful biclusters in complex real-world data.
Contribution
The paper introduces HBIC, a biclustering method capable of handling numeric, binary, and categorical data, with a two-stage process for bicluster generation and selection.
Findings
Successfully identified high-quality biclusters in synthetic benchmarks.
Demonstrated effectiveness on biomedical clinical data.
Outperformed existing biclustering approaches on heterogeneous datasets.
Abstract
Biclustering is an unsupervised machine-learning approach aiming to cluster rows and columns simultaneously in a data matrix. Several biclustering algorithms have been proposed for handling numeric datasets. However, real-world data mining problems often involve heterogeneous datasets with mixed attributes. To address this challenge, we introduce a biclustering approach called HBIC, capable of discovering meaningful biclusters in complex heterogeneous data, including numeric, binary, and categorical data. The approach comprises two stages: bicluster generation and bicluster model selection. In the initial stage, several candidate biclusters are generated iteratively by adding and removing rows and columns based on the frequency of values in the original matrix. In the second stage, we introduce two approaches for selecting the most suitable biclusters by considering their size and…
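The generation stage described in the abstract (adding and removing rows and columns based on value frequencies) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `min_share` threshold, the seeding strategy, and the assumption that numeric columns have already been discretized into labels are all hypothetical choices made for illustration only.

```python
from collections import Counter


def modal_value(data, rows, col):
    """Return (value, share) of the most frequent value of `col` among `rows`."""
    counts = Counter(data[r][col] for r in rows)
    value, count = counts.most_common(1)[0]
    return value, count / len(rows)


def grow_bicluster(data, seed_col, min_share=0.8, max_iter=10):
    """Grow one candidate bicluster from a seed column (illustrative only).

    Assumption: every cell holds a discrete label, so numeric columns
    were binned beforehand. `min_share` is a hypothetical threshold.
    """
    n_cols = len(data[0])
    all_rows = list(range(len(data)))

    # Seed: rows sharing the most frequent value of the seed column.
    seed_value, _ = modal_value(data, all_rows, seed_col)
    rows = [r for r in all_rows if data[r][seed_col] == seed_value]
    cols = {seed_col: seed_value}

    for _ in range(max_iter):
        # Column step: keep columns whose modal value dominates the current rows.
        new_cols = {}
        for c in range(n_cols):
            value, share = modal_value(data, rows, c)
            if share >= min_share:
                new_cols[c] = value

        # Row step: keep rows that match the modal value on every kept column.
        new_rows = [
            r for r in all_rows
            if all(data[r][c] == v for c, v in new_cols.items())
        ]

        # Stop when nothing changes or the bicluster would become empty.
        if not new_rows or (new_rows == rows and new_cols == cols):
            break
        rows, cols = new_rows, new_cols

    return rows, sorted(cols)


# Toy heterogeneous table: binned numeric, binary, and two categorical columns.
data = [
    ["high", 1, "A", "x"],
    ["high", 1, "A", "y"],
    ["high", 1, "A", "z"],
    ["low", 0, "B", "x"],
    ["low", 1, "C", "y"],
    ["high", 0, "B", "z"],
]

rows, cols = grow_bicluster(data, seed_col=0, min_share=0.7)
print(rows, cols)
```

On this toy table the sketch converges to the first three rows and the first three columns, where every selected column holds a single value. A full method would generate many such candidates from different seeds and then apply a selection step, which the abstract describes only in part.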
Taxonomy
Topics: Neural Networks and Applications
