Beyond Labels: Advancing Cluster Analysis with the Entropy of Distance Distribution (EDD)
Claus Metzner, Achim Schilling, Patrick Krauss

TL;DR
This paper introduces the Entropy of Distance Distribution (EDD), a label-free measure that uses the Shannon entropy of the pairwise distance distribution to quantify and distinguish clustering structure in high-dimensional data.
Contribution
The paper presents EDD, an entropy-based measure of clustering that requires no labels, is invariant to translation, permutation, and scaling of the data, and is intended to detect complex cluster patterns.
Findings
EDD increases as cluster overlap grows, so lower values indicate more strongly separated clusters.
EDD is invariant to translation, permutation, and scaling of data.
Experimental results show that EDD tracks the degree of clustering in a data set.
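The invariance findings can be motivated with a short derivation. The notation below ($x_i$ for data points, $d_{ij}$ for pairwise distances, $p_k$ for the fraction of distances in histogram bin $k$, $K$ for the number of bins) is assumed here for illustration, and rescaling by the maximum distance is only one possible normalization, not necessarily the one used in the paper.

```latex
% Pairwise distances, histogram entropy, and normalized entropy
d_{ij} = \lVert x_i - x_j \rVert, \qquad
H = -\sum_{k=1}^{K} p_k \log p_k, \qquad
\mathrm{EDD} = \frac{H}{\log K}

% Translation x_i \to x_i + t leaves every distance unchanged
\lVert (x_i + t) - (x_j + t) \rVert = \lVert x_i - x_j \rVert = d_{ij}

% Scaling x_i \to a x_i with a > 0 cancels once distances are rescaled
\frac{a \, d_{ij}}{\max_{k<l} a \, d_{kl}} = \frac{d_{ij}}{\max_{k<l} d_{kl}}
```

A permutation of the points only reorders the set $\{d_{ij}\}$, so the histogram $p_k$, and with it the entropy, stays the same.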
Abstract
In the evolving landscape of data science, the accurate quantification of clustering in high-dimensional data sets remains a significant challenge, especially in the absence of predefined labels. This paper introduces a novel approach, the Entropy of Distance Distribution (EDD), which represents a paradigm shift in label-free clustering analysis. Traditional methods, reliant on discrete labels, often struggle to discern intricate cluster patterns in unlabeled data. EDD, however, leverages the characteristic differences in pairwise point-to-point distances to discern clustering tendencies, independent of data labeling. Our method employs the Shannon information entropy to quantify the 'peakedness' or 'flatness' of distance distributions in a data set. This entropy measure, normalized against its maximum value, effectively distinguishes between strongly clustered data (indicated by…
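As a rough illustration of the idea described in the abstract, the following sketch computes a normalized Shannon entropy of the pairwise-distance histogram. It is not the authors' implementation: the function name `edd`, the bin count `n_bins`, the rescaling of distances by their maximum, and the synthetic Gaussian test data are all assumptions made for this example.

```python
import numpy as np

def edd(points, n_bins=50):
    """Normalized Shannon entropy of the pairwise-distance histogram (sketch)."""
    x = np.asarray(points, dtype=float)
    # All pairwise Euclidean distances; keep each pair once (i < j).
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    d = dist[np.triu_indices(len(x), k=1)]
    # Assumption: rescale by the largest distance so the bins do not
    # depend on the units of the data (requires at least two distinct points).
    d = d / d.max()
    counts, _ = np.histogram(d, bins=n_bins, range=(0.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    entropy = -(p * np.log(p)).sum()
    # The maximum possible entropy for n_bins bins is log(n_bins).
    return entropy / np.log(n_bins)

# Hypothetical test data: three Gaussian clusters, tight versus overlapping.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
tight = np.concatenate([c + 0.1 * rng.standard_normal((100, 2)) for c in centers])
broad = np.concatenate([c + 5.0 * rng.standard_normal((100, 2)) for c in centers])

edd_tight = edd(tight)
edd_broad = edd(broad)
print(f"tight clusters: {edd_tight:.3f}, overlapping clusters: {edd_broad:.3f}")
```

With tight clusters the distances concentrate in a few histogram bins (within-cluster and between-cluster distances), giving a peaked distribution and a low value; with heavily overlapping clusters the distances spread over many bins and the value rises. The full distance matrix costs memory quadratic in the number of points, so this sketch is only suitable for small data sets.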
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Face and Expression Recognition
Methods: Sparse Evolutionary Training
