Modal Clustering for Categorical Data
Noemi Corsini, Giovanna Menardi

TL;DR
This paper introduces a notion of cluster for categorical data built on two ideas, high frequency and variable association, and adapts modal clustering tools from the continuous setting into a clustering procedure for categorical data.
Contribution
It proposes a novel categorical clustering method grounded in modal clustering principles, bridging the gap between continuous and categorical data analysis.
Findings
Illustrated on real datasets
Tested through simulations
Outperforms existing methods
Abstract
Despite the inherent lack of a ground truth in clustering, a broad consensus is overall acknowledged in defining the concept of cluster in the continuous setting. Conversely, this remains controversial in the presence of categorical data. We propose a novel notion of cluster based on the dual concepts of high frequency and variable association. We show how the concept of high frequency aligns with the cluster notion provided by modal clustering in the continuous setting, which allows us to borrow and adapt existing operational tools to develop a novel procedure. The method is illustrated on some real data and tested via simulations.
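The "high frequency" idea can be made concrete with a small sketch. This is not the authors' procedure: it is a generic discrete analogue of modal clustering, written under the assumptions that the neighbors of a configuration are the configurations differing in exactly one variable (Hamming distance 1) and that each observation is assigned to the local frequency mode reached by steepest ascent. The function names `hamming_neighbors` and `modal_clusters` are hypothetical, and the variable-association component of the proposed cluster notion is not modelled here.

```python
from collections import Counter


def hamming_neighbors(cell, levels):
    """Yield every configuration that differs from `cell` in exactly one variable."""
    for j, current in enumerate(cell):
        for value in levels[j]:
            if value != current:
                yield cell[:j] + (value,) + cell[j + 1:]


def modal_clusters(rows):
    """Label each row with the local frequency mode reached by hill-climbing.

    Assumption (not from the paper): neighborhoods are Hamming-distance-1 balls
    and ascent moves to the most frequent neighbor until no neighbor is
    strictly more frequent than the current configuration.
    """
    rows = [tuple(r) for r in rows]
    freq = Counter(rows)  # unobserved configurations count as 0
    levels = [sorted({r[j] for r in rows}) for j in range(len(rows[0]))]

    def climb(cell):
        while True:
            # Most frequent neighbor; ties are broken by value for determinism.
            best = max(
                hamming_neighbors(cell, levels),
                key=lambda c: (freq[c], c),
                default=cell,
            )
            if freq[best] <= freq[cell]:
                return cell  # no strictly more frequent neighbor: a local mode
            cell = best

    return [climb(r) for r in rows]


# Toy data with two categorical variables and two high-frequency configurations.
rows = (
    [("a", "x")] * 5
    + [("a", "y")] * 2
    + [("b", "y")] * 6
    + [("b", "x")] * 1
)
labels = modal_clusters(rows)
modes = sorted(set(labels))
```

On this toy data the two frequent configurations act as modes, and the rarer configurations are absorbed by whichever mode their ascent path reaches. Each step strictly increases the frequency, so the climb always terminates.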
