Using Gaussian Measures for Efficient Constraint Based Clustering
Chandrima Sarkar, Atanu Roy

TL;DR
This paper introduces a new high-dimensional clustering method that combines CF trees with Gaussian density constraints to improve cluster quality and overcome limitations of traditional hierarchical clustering.
Contribution
The paper proposes a novel iterative multiphase clustering algorithm using Gaussian measures on CF trees, enhancing cluster refinement and flexibility over existing methods.
Findings
Improved cluster quality through Gaussian-based refinement.
Overcomes hierarchical clustering limitations like non-reversibility.
Effective for need-driven high-dimensional data analysis.
Abstract
In this paper we present a novel iterative multiphase clustering technique for efficiently clustering high-dimensional data points. For this purpose we build a clustering feature (CF) tree on a real data set and apply a Gaussian density distribution constraint to the resulting CF tree. Post-processing the micro-clusters with the Gaussian density distribution function refines the previously formed clusters and thus improves their quality. The algorithm also overcomes an inherent drawback of conventional hierarchical clustering methods, namely the inability to undo changes made to the dendrogram of the data points. Moreover, the constraint measure applied in the algorithm makes this clustering technique suitable for need-driven data analysis. We support our claims by evaluating our algorithm against other similar clustering algorithms.…
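The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a BIRCH-style clustering feature that stores a point count, a per-dimension linear sum, and a per-dimension squared sum, and it assumes the Gaussian refinement can be approximated by reassigning each point to the micro-cluster whose diagonal Gaussian gives it the highest density. The names `CF`, `summarize`, `log_density`, and `refine` are hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CF:
    """Clustering feature: count, per-dimension linear sum and squared sum."""
    n: int
    ls: List[float]
    ss: List[float]

    @classmethod
    def from_point(cls, p: Sequence[float]) -> "CF":
        return cls(1, list(p), [x * x for x in p])

    def merge(self, other: "CF") -> "CF":
        # CF additivity: summaries of disjoint point sets add component-wise.
        return CF(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            [a + b for a, b in zip(self.ss, other.ss)],
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.ls]

    def variance(self, floor: float = 1e-6) -> List[float]:
        # Per-dimension variance E[x^2] - E[x]^2, floored to stay positive.
        return [
            max(q / self.n - (s / self.n) ** 2, floor)
            for s, q in zip(self.ls, self.ss)
        ]


def summarize(points: Sequence[Sequence[float]]) -> CF:
    """Build one micro-cluster summary by merging single-point CFs."""
    cf = CF.from_point(points[0])
    for p in points[1:]:
        cf = cf.merge(CF.from_point(p))
    return cf


def log_density(p: Sequence[float], cf: CF) -> float:
    """Log of a diagonal Gaussian density derived from the CF summary."""
    total = 0.0
    for x, mu, var in zip(p, cf.centroid(), cf.variance()):
        total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total


def refine(points: Sequence[Sequence[float]], cfs: Sequence[CF]) -> List[int]:
    """Assumed refinement step: pick the highest-density micro-cluster per point."""
    return [
        max(range(len(cfs)), key=lambda k: log_density(p, cfs[k]))
        for p in points
    ]
```

Because the mean and variance are recovered from the summary alone, a refinement pass of this kind needs only the micro-cluster statistics rather than every point that was absorbed into them, which is the property that makes CF summaries attractive for large data sets.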
Taxonomy
Topics: Advanced Clustering Algorithms Research · Data Management and Algorithms · Face and Expression Recognition
