A Distributional Approach for Soft Clustering Comparison and Evaluation
Andrea Campagner, Davide Ciucci, Thierry Denœux

TL;DR
This paper introduces a novel distributional framework for comparing and evaluating soft clustering results, addressing limitations of existing methods by modeling uncertainty and providing a general, mathematically grounded approach.
Contribution
It proposes a new distributional measure for soft clustering comparison based on interpreting soft clusterings as distributions over hard clusterings, with complexity analysis and approximation techniques.
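The idea of reading a soft clustering as a distribution over hard clusterings can be sketched with a small Monte Carlo example. This is only an illustration under stated assumptions, not the paper's construction: it assumes each soft clustering is a row-stochastic membership matrix, that objects are assigned to clusters independently, that the Rand index is the base comparison measure, and that plain sampling stands in for the approximation techniques the paper mentions. All function names are hypothetical.

```python
import random
from itertools import combinations


def sample_hard_clustering(memberships, rng):
    """Draw one hard clustering from a soft one.

    Assumption: each row holds non-negative membership degrees that act as
    sampling weights, and objects are labelled independently of each other.
    """
    return [rng.choices(range(len(row)), weights=row, k=1)[0] for row in memberships]


def rand_index(labels_a, labels_b):
    """Rand index: fraction of object pairs on which two hard clusterings agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def sampled_rand_indices(soft_a, soft_b, n_samples=1000, seed=0):
    """Sample paired hard clusterings and collect their Rand index values.

    The returned list is an empirical approximation of the distribution of
    the base measure induced by the two soft clusterings.
    """
    rng = random.Random(seed)
    return [
        rand_index(
            sample_hard_clustering(soft_a, rng),
            sample_hard_clustering(soft_b, rng),
        )
        for _ in range(n_samples)
    ]


# Two soft clusterings of four objects into two clusters (made-up numbers).
soft_a = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.5, 0.5]]
soft_b = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

values = sampled_rand_indices(soft_a, soft_b)
mean_ri = sum(values) / len(values)
print(f"mean Rand index over samples: {mean_ri:.3f}")
print(f"range: [{min(values):.3f}, {max(values):.3f}]")
```

Keeping the whole list of sampled values, rather than only the mean, is what makes the comparison distributional in this sketch: the spread of the values reflects how uncertain the soft assignments are.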
Findings
The approach effectively captures uncertainty in soft clustering results.
The paper studies the complexity-theoretic and metric-theoretic properties of the proposed measures.
An illustrative experiment demonstrates the practicality of the proposed measures.
Abstract
The development of external evaluation criteria for soft clustering (SC) has received limited attention: existing methods do not provide a general approach to extend comparison measures to SC, and are unable to account for the uncertainty represented in the results of SC algorithms. In this article, we propose a general method to address these limitations, building on a novel interpretation of SC as distributions over hard clusterings; we call the resulting measures distributional measures. We provide an in-depth study of complexity- and metric-theoretic properties of the proposed approach, and we describe approximation techniques that can make the calculations tractable. Finally, we illustrate our approach through a simple but illustrative experiment.
Taxonomy
Topics: Advanced Clustering Algorithms Research · Complex Network Analysis Techniques · Rough Sets and Fuzzy Logic
