On Data-Independent Properties for Density-Based Dissimilarity Measures in Hybrid Clustering
Kajsa Møllersen, Subhra S. Dhar, Fred Godtliebsen

TL;DR
This paper introduces six data-independent properties to evaluate density-based dissimilarity measures in hybrid clustering, highlighting the limitations of existing measures and proposing a new measure that satisfies all properties for improved clustering performance.
Contribution
The paper proposes six novel data-independent properties for density-based dissimilarity measures and introduces a new measure based on Kullback-Leibler divergence that satisfies all these properties.
Findings
Existing measures do not satisfy all proposed properties.
The new Kullback-Leibler based measure satisfies all properties.
The properties help guide the choice of a dissimilarity measure for a given hybrid clustering problem.
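As a rough illustration of what checking a data-independent property can look like, the sketch below tests equality and symmetry on a symmetrised Kullback-Leibler divergence between two univariate Gaussian densities. This is not the measure proposed in the paper, whose definition is not given in this summary; the function names `kl_gauss` and `sym_kl` and the choice of univariate Gaussians are assumptions made purely for illustration.

```python
import math


def kl_gauss(m1: float, s1: float, m2: float, s2: float) -> float:
    """Closed-form KL(p || q) for p = N(m1, s1^2) and q = N(m2, s2^2)."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5


def sym_kl(m1: float, s1: float, m2: float, s2: float) -> float:
    """Hypothetical dissimilarity: KL(p || q) + KL(q || p).

    Illustrative only; not the measure defined in the paper.
    """
    return kl_gauss(m1, s1, m2, s2) + kl_gauss(m2, s2, m1, s1)


# Equality: identical densities should have zero dissimilarity.
equal_ok = math.isclose(sym_kl(0.0, 1.0, 0.0, 1.0), 0.0, abs_tol=1e-12)

# Symmetry: swapping the two clusters should not change the value.
symmetric_ok = math.isclose(sym_kl(0.0, 1.0, 2.0, 1.5), sym_kl(2.0, 1.5, 0.0, 1.0))

# Plain KL divergence, by contrast, depends on the argument order.
plain_kl_asymmetric = not math.isclose(
    kl_gauss(0.0, 1.0, 2.0, 1.5), kl_gauss(2.0, 1.5, 0.0, 1.0)
)

print(equal_ok, symmetric_ok, plain_kl_asymmetric)
```

Checks of this kind depend only on the form of the measure and the densities plugged in, not on any particular dataset, which is the sense in which such properties are data-independent.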
Abstract
Hybrid clustering combines partitional and hierarchical clustering for computational effectiveness and versatility in cluster shape. In such clustering, a dissimilarity measure plays a crucial role in the hierarchical merging. The dissimilarity measure has great impact on the final clustering, and data-independent properties are needed to choose the right dissimilarity measure for the problem at hand. Properties for distance-based dissimilarity measures have been studied for decades, but properties for density-based dissimilarity measures have so far received little attention. Here, we propose six data-independent properties to evaluate density-based dissimilarity measures associated with hybrid clustering, regarding equality, orthogonality, symmetry, outlier and noise observations, and light-tailed models for heavy-tailed clusters. The significance of the properties is investigated,…
