# Batch Incremental Shared Nearest Neighbor Density Based Clustering Algorithm for Dynamic Datasets

**Authors:** Panthadeep Bhattacharjee, Amit Awekar

arXiv: 1701.09049 · 2017-02-02

## TL;DR

This paper introduces a batch incremental version of shared nearest neighbor density based clustering (SNND) for dynamic datasets. It processes insertions and deletions in batches, runs up to four orders of magnitude faster than SNND, and produces clustering results identical to SNND.

## Contribution

The proposed algorithm extends shared nearest neighbor density based clustering (SNND) to process insertions and deletions in batch mode. The existing incremental extension of SNND cannot handle deletions and handles insertions only one point at a time.

## Key findings

- Up to 10,000 times faster than SNND
- Requires up to 60% more memory than SNND
- Produces identical clustering output to SNND
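
The "identical output" finding can be checked mechanically by comparing two label assignments. The following is a minimal sketch, not taken from the paper: the function name `same_partition` is hypothetical, and it assumes each clustering is given as one label per point, with noise (if any) represented as an ordinary label. Two labelings count as identical when they group the points the same way, even if the cluster ids differ.

```python
def same_partition(labels_a, labels_b):
    """Return True if two label sequences describe the same clustering,
    ignoring how the cluster ids are named (hypothetical helper)."""
    if len(labels_a) != len(labels_b):
        return False
    a_to_b, b_to_a = {}, {}
    for a, b in zip(labels_a, labels_b):
        # setdefault records the first pairing seen for each label; a later
        # mismatch means the two labelings group the points differently.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True
```

For example, `same_partition([0, 0, 1, 1], [5, 5, 2, 2])` holds because only the ids differ, while `same_partition([0, 0, 1, 1], [0, 1, 1, 1])` does not.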

## Abstract

Incremental data mining algorithms process frequent updates to dynamic datasets efficiently by avoiding redundant computation. The existing incremental extension to the shared nearest neighbor density based clustering (SNND) algorithm cannot handle deletions from the dataset and handles insertions only one point at a time. We present an incremental algorithm that overcomes both of these bottlenecks by efficiently identifying the affected parts of clusters while processing updates to the dataset in batch mode. We show the effectiveness of our algorithm through experiments on large synthetic as well as real world datasets. Our algorithm is up to four orders of magnitude faster than SNND and requires up to 60% more memory than SNND, while providing output identical to SNND.
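
For context, here is a minimal sketch of the non-incremental baseline as it is commonly described: shared nearest neighbor similarity counts the neighbors two points have in common, but only when each point appears in the other's k-nearest-neighbor list, and a point is treated as a core point when enough of its neighbors are that similar. This summary does not spell out the paper's definitions or its batch bookkeeping, so the function names and the parameters `k`, `eps`, and `min_pts` are assumptions for illustration, and the brute-force neighbor search is chosen only for brevity.

```python
from math import dist


def knn_lists(points, k):
    """Brute-force k nearest neighbours (as index sets) for every point."""
    neighbours = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        neighbours.append(set(others[:k]))
    return neighbours


def snn_similarity(neighbours, i, j):
    """Shared-neighbour count, kept only if i and j list each other."""
    if i in neighbours[j] and j in neighbours[i]:
        return len(neighbours[i] & neighbours[j])
    return 0


def core_points(points, k, eps, min_pts):
    """Indices whose SNN density (neighbours with similarity >= eps)
    reaches min_pts. Parameter names are assumptions, not the paper's."""
    neighbours = knn_lists(points, k)
    cores = []
    for i in range(len(points)):
        density = sum(1 for j in neighbours[i]
                      if snn_similarity(neighbours, i, j) >= eps)
        if density >= min_pts:
            cores.append(i)
    return cores
```

In this sketch, every insertion or deletion would force the neighbor lists and densities to be rebuilt from scratch. That is the redundant computation an incremental method aims to avoid by touching only the affected points.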

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1701.09049/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1701.09049/full.md

## References

4 references — full list in the complete paper: https://tomesphere.com/paper/1701.09049/full.md

---
Source: https://tomesphere.com/paper/1701.09049