# Faster DB-scan and HDB-scan in Low-Dimensional Euclidean Spaces

**Authors:** Mark de Berg, Ade Gunawan, Marcel Roeloffzen

arXiv: 1702.08607 · 2017-03-01

## TL;DR

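This paper introduces an $O(n\log n)$ algorithm for DBscan in $\mathbb{R}^2$ whose running time does not depend on the scale parameter $\varepsilon$ (for constant MinPts), an $O(n\log n)$ randomized algorithm for HDBscan in the plane, and a near-linear-time algorithm for an approximate version of HDBscan in any fixed dimension.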

## Contribution

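It presents $O(n\log n)$ algorithms for DBscan and (randomized) HDBscan in the plane, and shows experimentally that a slightly simplified DBscan variant is competitive in practice while being much less sensitive to the choice of $\varepsilon$ than the original DBscan algorithm.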

## Key findings

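- The new DBscan algorithm runs in $O(n\log n)$ time in $\mathbb{R}^2$, irrespective of $\varepsilon$, assuming MinPts is a fixed constant.
- A slightly simplified version is competitive in practice and much less sensitive to the choice of $\varepsilon$ than the original DBscan algorithm.
- HDBscan can be computed in the plane by an $O(n\log n)$ randomized algorithm, and an approximate version can be computed in near-linear time in any fixed dimension.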

## Abstract

We present a new algorithm for the widely used density-based clustering method DBscan. Our algorithm computes the DBscan-clustering in $O(n\log n)$ time in $\mathbb{R}^2$, irrespective of the scale parameter $\varepsilon$ (and assuming the second parameter MinPts is set to a fixed constant, as is the case in practice). Experiments show that the new algorithm is not only fast in theory, but that a slightly simplified version is competitive in practice and much less sensitive to the choice of $\varepsilon$ than the original DBscan algorithm. We also present an $O(n\log n)$ randomized algorithm for HDBscan in the plane (HDBscan is a hierarchical version of DBscan introduced recently), and we show how to compute an approximate version of HDBscan in near-linear time in any fixed dimension.
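To make the roles of $\varepsilon$ and MinPts concrete, here is a minimal sketch of the standard DBscan definition for 2D points. It is not the paper's $O(n\log n)$ algorithm: it assumes a plain uniform grid with cell side $\varepsilon$ for neighbour lookup, and its running time can degrade when many points fall into a few cells. The function name `dbscan_2d` and its parameters are hypothetical, chosen for illustration only.

```python
from collections import defaultdict, deque
from math import floor


def dbscan_2d(points, eps, min_pts):
    """Illustrative DBscan for 2D points; returns one label per point (-1 = noise).

    Assumption: this is a textbook-style sketch, not the algorithm from the paper.
    """
    # Bucket points into a uniform grid with cell side eps, so every
    # eps-neighbour of a point lies in its own cell or one of the 8 adjacent cells.
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(floor(x / eps), floor(y / eps))].append(i)

    def neighbours(i):
        x, y = points[i]
        cx, cy = floor(x / eps), floor(y / eps)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    qx, qy = points[j]
                    if (qx - x) ** 2 + (qy - y) ** 2 <= eps * eps:
                        found.append(j)
        return found

    nbrs = [neighbours(i) for i in range(len(points))]
    # A point is a core point if its eps-neighbourhood (itself included)
    # contains at least min_pts points.
    is_core = [len(n) >= min_pts for n in nbrs]

    labels = [-1] * len(points)
    cluster = 0
    for i in range(len(points)):
        if not is_core[i] or labels[i] != -1:
            continue
        # Grow a new cluster from this unlabelled core point; only core
        # points are expanded, border points are labelled but not expanded.
        labels[i] = cluster
        queue = deque([i])
        while queue:
            p = queue.popleft()
            for q in nbrs[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if is_core[q]:
                        queue.append(q)
        cluster += 1
    return labels
```

With `eps=1.0` and `min_pts=3`, two tight triples of points that lie far apart come out as two separate clusters, and an isolated point between them is labelled `-1` (noise). Changing `eps` changes how many candidates each grid lookup has to scan, which is one way to see why sensitivity to $\varepsilon$ matters in practice.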

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.08607/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1702.08607/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1702.08607/full.md

---
Source: https://tomesphere.com/paper/1702.08607