Clustering high dimensional data using subspace and projected clustering algorithms

Rahmat Widia Sembiring; Jasni Mohamad Zain; Abdullah Embong

arXiv:1009.0384·cs.DB·September 3, 2010

TL;DR

This paper compares three clustering algorithms for high-dimensional data (PROCLUS, P3C, and STATPC) on computation time, the number of unclustered data points, and the accuracy of the cluster points and relevant attributes found.

Contribution

It provides a detailed experimental comparison of three algorithms for subspace and projected clustering in high-dimensional data.

Findings

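01. PROCLUS has the shortest computation time and leaves the fewest data points unclustered.

02. STATPC is more accurate than PROCLUS and P3C in both the cluster points and the relevant attributes found.

03. No single algorithm leads on every metric: PROCLUS is ahead on computation time and unclustered data, STATPC on accuracy.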

Abstract

Problem statement: Many clustering techniques have been developed in statistics, pattern recognition, data mining, and other fields. Subspace clustering enumerates clusters of objects in all subspaces of a dataset and tends to produce many overlapping clusters. Approach: Subspace clustering and projected clustering are research areas for clustering in high-dimensional spaces. In this research, we experiment with three clustering-oriented algorithms: PROCLUS, P3C, and STATPC. Results: In general, PROCLUS performs better in terms of computation time and produces the fewest unclustered data points, while STATPC outperforms PROCLUS and P3C in the accuracy of both the cluster points and the relevant attributes found. Conclusions/Recommendations: In this study, we analyze in detail the properties of the different data clustering methods.
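The abstract notes that subspace clustering enumerates clusters in all subspaces of a dataset. As a rough illustration of why that search space grows so quickly, the sketch below lists every non-empty subset of attributes for a dataset with `n_dims` attributes. The function names `enumerate_subspaces` and `count_subspaces` are hypothetical and chosen only for this example. This is not how PROCLUS, P3C, or STATPC search for clusters; it only shows the number of candidate axis-parallel subspaces.

```python
from itertools import combinations


def enumerate_subspaces(n_dims):
    """Yield every non-empty subset of attribute indices 0..n_dims-1.

    Illustrative helper (assumption, not from the paper): each tuple is one
    candidate axis-parallel subspace in which clusters could be searched.
    """
    for size in range(1, n_dims + 1):
        for subset in combinations(range(n_dims), size):
            yield subset


def count_subspaces(n_dims):
    """Number of non-empty attribute subsets: 2^n_dims - 1."""
    return 2 ** n_dims - 1


if __name__ == "__main__":
    # Small case: list the subspaces explicitly.
    print(list(enumerate_subspaces(3)))
    # Larger cases: only count them, since listing quickly becomes impractical.
    for d in (10, 20, 50):
        print(d, count_subspaces(d))
```

Because the count doubles with every added attribute, exhaustive enumeration is only feasible for small dimensionalities, and the same group of objects can reappear in many of these subspaces as overlapping clusters.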
