A General Hybrid Clustering Technique
Saeid Amiri, Bertrand Clarke, Jennifer Clarke, Hoyt A. Koepke

TL;DR
This paper introduces a versatile hybrid clustering method capable of handling complex, non-convex data structures through a multi-stage process involving initial over-segmentation, stabilization via dendrograms, and pruning to achieve the desired number of clusters.
Contribution
The paper presents a novel hybrid clustering approach that effectively manages non-convex clusters and estimates the true number of clusters, with theoretical justification and empirical validation.
Findings
Outperforms existing methods on real and simulated data
Effectively estimates the true number of clusters
Handles non-convex and complex cluster shapes
Abstract
Here, we propose a clustering technique for general clustering problems including those that have non-convex clusters. For a given desired number of clusters $K$, we use three stages to find a clustering. The first stage uses a hybrid clustering technique to produce a series of clusterings of various sizes (randomly selected). The key steps are to find a $K$-means clustering using $K_\ell$ clusters where $K_\ell \gg K$ and then join these small clusters by using single linkage clustering. The second stage stabilizes the result of stage one by reclustering via the 'membership matrix' under Hamming distance to generate a dendrogram. The third stage is to cut the dendrogram to get $K^*$ clusters where $K^* \geq K$ and then prune back to $K$ to give a final clustering. A variant on our technique also gives a reasonable estimate for $K_T$, the true number of clusters. We provide a series…
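The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the range from which the over-segmentation size `K_l` is drawn, the choice to join each run down to `K` groups, the average-linkage dendrogram on the membership matrix, and the pruning rule (keep the `K` largest clusters and reassign the remaining points to the nearest kept cluster mean) are all hypothetical choices made for the example. It relies on NumPy and SciPy.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.cluster.vq import kmeans2
from scipy.spatial.distance import cdist, pdist


def make_rings(n_per_ring, seed=0):
    """Toy non-convex data: two noisy concentric rings (illustrative only)."""
    rng = np.random.default_rng(seed)
    rings = []
    for radius in (1.0, 3.0):
        theta = rng.uniform(0.0, 2.0 * np.pi, n_per_ring)
        ring = radius * np.column_stack([np.cos(theta), np.sin(theta)])
        rings.append(ring + rng.normal(scale=0.05, size=ring.shape))
    return np.vstack(rings)


def hybrid_run(X, K, K_l, rng):
    """Stage 1, one run: k-means with K_l >> K clusters, then single linkage."""
    _, fine = kmeans2(X, K_l, minit="++", seed=rng)
    # Relabel so that empty k-means clusters are dropped.
    used, fine = np.unique(fine, return_inverse=True)
    if len(used) <= K:
        return fine
    centroids = np.vstack([X[fine == j].mean(axis=0) for j in range(len(used))])
    # Join the small clusters by single linkage on their centroids.
    # Assumption: each run is joined down to K groups.
    merged = fcluster(linkage(centroids, method="single"), t=K, criterion="maxclust")
    return merged[fine]


def membership_matrix(X, K, B, rng):
    """Run stage 1 B times; column b holds the labels from run b (n x B)."""
    lo, hi = 5 * K, 15 * K  # hypothetical range for the random size K_l
    runs = [hybrid_run(X, K, int(rng.integers(lo, hi + 1)), rng) for _ in range(B)]
    return np.column_stack(runs)


def hybrid_cluster(X, K, K_star=None, B=20, seed=0):
    """Stages 2 and 3: stabilize via Hamming distance, cut, prune back to K."""
    rng = np.random.default_rng(seed)
    M = membership_matrix(X, K, B, rng)
    # Hamming distance between rows: fraction of runs in which two points
    # were placed in different clusters.
    D = pdist(M, metric="hamming")
    Z = linkage(D, method="average")  # assumption: linkage type is our choice
    K_star = K_star if K_star is not None else K + 3
    cut = fcluster(Z, t=K_star, criterion="maxclust")
    # Hypothetical pruning rule: keep the K largest clusters and move every
    # other point to the kept cluster with the nearest mean.
    ids, counts = np.unique(cut, return_counts=True)
    keep = ids[np.argsort(counts)[::-1][:K]]
    centers = np.vstack([X[cut == c].mean(axis=0) for c in keep])
    final = cdist(X, centers).argmin(axis=1)
    for i, c in enumerate(keep):
        final[cut == c] = i
    return final
```

For example, `hybrid_cluster(make_rings(150), K=2)` returns one label per point, taking values in `{0, 1}`. Because only the centroids of the small k-means clusters are joined by single linkage, chains of neighbouring small clusters can follow a curved or ring-shaped group, which is the intuition behind handling non-convex clusters. The Hamming step is invariant to how labels are numbered within each run, since it only checks whether two points share a label in that run.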
