HCA-DBSCAN: HyperCube Accelerated Density Based Spatial Clustering for Applications with Noise
Vinayak Mathur, Jinesh Mehta, Sanjay Singh

TL;DR
This paper introduces HCA-DBSCAN, a new accelerated clustering algorithm that overlays data with grids and uses representative points, significantly improving DBSCAN's speed while maintaining its ability to find arbitrary-shaped clusters.
Contribution
The paper presents HCA-DBSCAN, a novel grid-based acceleration method that reduces computational complexity and speeds up DBSCAN clustering by up to 58.27%.
Findings
Achieves up to 58.27% faster runtime compared to existing methods.
Effectively identifies clusters of arbitrary shapes.
Reduces the number of comparisons using representative points.
Abstract
Density-based clustering has found numerous applications across various domains. The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm can find clusters of varied shapes that are not linearly separable, and it is not sensitive to outliers in the data. Combined with the fact that the number of clusters does not need to be known a priori, this makes DBSCAN really powerful. However, its slow performance (O(n^2)) limits its applications. In this work, we present a new clustering algorithm, the HyperCube Accelerated DBSCAN (HCA-DBSCAN), which uses a combination of distance-based aggregation by overlaying the data with customized grids. We use representative points to reduce the number of comparisons that need to be computed. Experimental results show that the proposed algorithm achieves a significant run time speedup of up to 58.27% when compared to…
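To make the grid-overlay idea concrete, here is a minimal sketch of how bucketing points into hypercubes can cut down the pairwise distance checks behind the O(n^2) cost. It is not the authors' implementation. The abstract does not say how the grid is sized, so the cell diagonal is assumed here to equal eps, a common choice in grid-based DBSCAN variants. The names `build_grid` and `candidate_pairs` are hypothetical, and the representative-point step is not reproduced.

```python
import math
from collections import defaultdict
from itertools import product


def build_grid(points, eps):
    """Bucket point indices into hypercube cells.

    Assumption of this sketch: the cell side is eps / sqrt(dim), so the
    cell diagonal equals eps and any two points sharing a cell are within
    eps of each other without an explicit distance check.
    """
    dim = len(points[0])
    side = eps / math.sqrt(dim)
    cells = defaultdict(list)
    for idx, point in enumerate(points):
        key = tuple(math.floor(coord / side) for coord in point)
        cells[key].append(idx)
    return dict(cells)


def candidate_pairs(points, eps):
    """Return cross-cell index pairs that could still be within eps.

    Pairs in the same cell are skipped (already known to be close), and
    pairs in cells more than ceil(sqrt(dim)) steps apart along any axis
    are skipped (guaranteed to be farther apart than eps).
    """
    dim = len(points[0])
    cells = build_grid(points, eps)
    reach = math.ceil(math.sqrt(dim))
    offsets = [o for o in product(range(-reach, reach + 1), repeat=dim) if any(o)]
    pairs = set()
    for key, members in cells.items():
        for offset in offsets:
            neighbour = cells.get(tuple(k + o for k, o in zip(key, offset)))
            if neighbour:
                pairs.update((min(i, j), max(i, j)) for i in members for j in neighbour)
    return pairs


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9), (5.0, 5.0)]
    total_pairs = len(pts) * (len(pts) - 1) // 2
    print(len(candidate_pairs(pts, eps=1.0)), "of", total_pairs, "pairs need a distance check")
```

In this toy input, only pairs that straddle nearby cells remain candidates for an explicit distance computation; the distant point never enters a comparison. Enumerating neighbour offsets grows exponentially with dimension, so this sketch is only practical for low-dimensional data.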
Taxonomy
Topics: Data Management and Algorithms · Advanced Clustering Algorithms Research · Human Mobility and Location-Based Analysis
