Density based Spatial Clustering of Lines via Probabilistic Generation of Neighbourhood

Akanksha Das; Malay Bhattacharyya

arXiv:2410.02290·cs.LG·October 4, 2024

TL;DR

This paper introduces a density-based clustering algorithm for lines in high-dimensional spaces that is robust to outliers, accommodates missing data, and can make use of domain knowledge, with applications demonstrated on synthetic and real-world datasets.

Contribution

It generalizes density-based clustering to lines in high-dimensional spaces using a probabilistic neighborhood approach, addressing the lack of a valid distance measure for lines.
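To illustrate the distance problem, take the smallest Euclidean gap between two lines as a candidate measure. This is an illustrative choice, not necessarily the measure the paper analyses, and the helper names `dot` and `min_line_distance` are hypothetical. The sketch below builds three lines in the plane, two parallel lines one unit apart and a third that crosses both, and shows that this candidate breaks the triangle inequality.

```python
import math

def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def min_line_distance(p1, d1, p2, d2, eps=1e-12):
    """Smallest Euclidean gap between the lines p1 + t*d1 and p2 + s*d2.

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    w = [x - y for x, y in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if denom < eps:
        # Parallel lines: fix t = 0 and project onto the second line.
        t, s = 0.0, e / c
    else:
        # Closed-form minimiser of |w + t*d1 - s*d2|^2.
        t = (b * e - c * d) / denom
        s = (a * e - b * d) / denom
    gap = [wi + t * x - s * y for wi, x, y in zip(w, d1, d2)]
    return math.sqrt(dot(gap, gap))

# Line A is y = 0, line C is y = 1, line B is x = 0 (it crosses both A and C).
A = ([0.0, 0.0], [1.0, 0.0])
B = ([0.0, 0.0], [0.0, 1.0])
C = ([0.0, 1.0], [1.0, 0.0])

d_ab = min_line_distance(*A, *B)
d_bc = min_line_distance(*B, *C)
d_ac = min_line_distance(*A, *C)

# The triangle inequality would require d_ac <= d_ab + d_bc.
violates_triangle_inequality = d_ac > d_ab + d_bc
```

Here the gaps from A to B and from B to C are both zero because the lines intersect, while A and C stay one unit apart, so `violates_triangle_inequality` is true for this candidate measure.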

Findings

1. Effective noise and outlier detection in clustering

2. Ability to cluster incomplete high-dimensional data

3. Successful application to real-world datasets like rail and road networks

Abstract

Density based spatial clustering of points in $\mathbb{R}^{n}$ has a myriad of applications in a variety of industries. We generalise this problem to the density based clustering of lines in high-dimensional spaces, bearing in mind that no valid distance measure for lines satisfies the triangle inequality. In this paper, we design a clustering algorithm that generates, for each line, a customised neighbourhood of fixed volume (given as a parameter), based on an optional parameter given as a continuous probability density function. This algorithm is not sensitive to outliers and can effectively identify noise in the data using a cardinality parameter. One of the pivotal applications of this algorithm is clustering data points in $\mathbb{R}^{n}$ with missing entries, while utilising the domain knowledge of the respective data. In particular, the proposed algorithm is able to…
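One way to read the missing-data application is that a point in $\mathbb{R}^{n}$ with a single unknown coordinate corresponds to an axis-parallel line: every value of the unknown coordinate gives one candidate point. The truncated abstract does not spell out this construction, so the sketch below is an assumption, and the names `incomplete_point_to_line` and `point_on_line` are hypothetical.

```python
def incomplete_point_to_line(row):
    """Map a row with exactly one missing entry (None) to a line.

    Returns (base_point, direction): the line is base_point + t * direction,
    where direction is the unit vector along the missing coordinate.
    Illustrative assumption only; not the paper's stated construction.
    """
    missing = [i for i, x in enumerate(row) if x is None]
    if len(missing) != 1:
        raise ValueError("this sketch handles exactly one missing entry per row")
    j = missing[0]
    # Put 0.0 in the unknown slot; any value would describe the same line.
    base_point = [0.0 if x is None else float(x) for x in row]
    direction = [1.0 if i == j else 0.0 for i in range(len(row))]
    return base_point, direction

def point_on_line(base_point, direction, t):
    """Candidate completion of the row when the missing value equals t."""
    return [b + t * d for b, d in zip(base_point, direction)]

# A 3-dimensional observation whose second coordinate was not recorded.
base_point, direction = incomplete_point_to_line([2.5, None, 7.0])
candidate = point_on_line(base_point, direction, 4.0)
```

Under this reading, a continuous probability density over `t` is one place where domain knowledge about the missing value could enter, for example by favouring plausible ranges of the unrecorded coordinate.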

Taxonomy

Topics: Advanced Clustering Algorithms Research