Model Based Co-clustering of Mixed Numerical and Binary Data

Aichetou Bouchareb (SAMM); Marc Boullé; Fabrice Clérot; Fabrice Rossi (CEREMADE)

arXiv:2212.11725·cs.LG·December 23, 2022

TL;DR

This paper introduces a novel co-clustering method for mixed numerical and binary data using extended latent block models, demonstrating its effectiveness through simulations and discussing its advantages and limitations.

Contribution

It extends latent block models to handle mixed data types, filling a gap in co-clustering methods for combined numerical and binary datasets.

Findings

01 Effective co-clustering on simulated mixed data

02 Advantages include improved block detection in mixed datasets

03 Potential limitations discussed in the context of real data

Abstract

Co-clustering is a data mining technique used to extract the underlying block structure between the rows and columns of a data matrix. Many approaches have been studied and have shown their capacity to extract such structures in continuous, binary or contingency tables. However, very little work has been done to perform co-clustering on mixed-type data. In this article, we extend latent block model based co-clustering to the case of mixed data (continuous and binary variables). We then evaluate the effectiveness of the proposed approach on simulated data and discuss its advantages and potential limitations.
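To make the idea of a latent block structure over mixed data concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes, for illustration only, that all columns share one row partition, that continuous and binary columns are clustered separately, and that each block is Gaussian (continuous) or Bernoulli (binary). The function names `simulate_mixed_lbm` and `block_log_likelihood` and all parameter values are hypothetical, and the mixing proportions of the full model are left out.

```python
import numpy as np

def simulate_mixed_lbm(n, d_cont, d_bin, mu, sigma, alpha, rng):
    """Draw mixed data with a shared row partition (illustrative assumption).

    mu, sigma: (G, Hc) Gaussian mean / std per (row cluster, continuous column cluster).
    alpha:     (G, Hb) Bernoulli probability per (row cluster, binary column cluster).
    """
    G, Hc = mu.shape
    Hb = alpha.shape[1]
    z = rng.integers(G, size=n)         # row cluster of each row
    wc = rng.integers(Hc, size=d_cont)  # cluster of each continuous column
    wb = rng.integers(Hb, size=d_bin)   # cluster of each binary column
    # Each cell is drawn from the distribution of the block it falls in.
    x_cont = rng.normal(mu[z][:, wc], sigma[z][:, wc])
    x_bin = (rng.random((n, d_bin)) < alpha[z][:, wb]).astype(int)
    return x_cont, x_bin, z, wc, wb

def block_log_likelihood(x_cont, x_bin, z, wc, wb, mu, sigma, alpha):
    """Log-likelihood of the data given the partitions and block parameters.

    Mixing proportions are omitted, so this is only the data term.
    """
    m, s = mu[z][:, wc], sigma[z][:, wc]
    a = alpha[z][:, wb]
    ll_cont = -0.5 * np.log(2 * np.pi * s**2) - (x_cont - m) ** 2 / (2 * s**2)
    ll_bin = x_bin * np.log(a) + (1 - x_bin) * np.log1p(-a)
    return ll_cont.sum() + ll_bin.sum()

# Hypothetical parameters: 2 row clusters, 2 column clusters of each type.
rng = np.random.default_rng(0)
mu = np.array([[0.0, 5.0], [5.0, 0.0]])
sigma = np.ones((2, 2))
alpha = np.array([[0.1, 0.9], [0.9, 0.1]])

x_cont, x_bin, z, wc, wb = simulate_mixed_lbm(200, 6, 6, mu, sigma, alpha, rng)

# The true row partition should explain the data better than a flipped one.
ll_true = block_log_likelihood(x_cont, x_bin, z, wc, wb, mu, sigma, alpha)
ll_flipped = block_log_likelihood(x_cont, x_bin, 1 - z, wc, wb, mu, sigma, alpha)
```

Comparing `ll_true` with `ll_flipped` shows why a likelihood-based criterion can recover blocks: assigning rows to the wrong cluster lowers the fit of both the continuous and the binary parts at once, since they share the row partition in this sketch.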
