TNStream: Applying Tightest Neighbors to Micro-Clusters to Define Multi-Density Clusters in Streaming Data
Qifen Zeng, Haomin Bao, Yuanzhuo Hu, Zirui Zhang, Yuheng Zheng, and Luosheng Wen

TL;DR
TNStream is a novel online clustering algorithm for streaming data that adaptively identifies multi-density clusters using Tightest Neighbors and Locality-Sensitive Hashing, improving clustering quality in complex data streams.
Contribution
The paper introduces TNStream, a fully online clustering method based on Tightest Neighbors and Skeleton Set theory, addressing multi-density and high-dimensional data challenges.
Findings
TNStream effectively handles multi-density data streams.
Experimental results show improved clustering quality over existing methods.
LSH enhances efficiency in high-dimensional clustering scenarios.
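As a rough illustration of why Locality-Sensitive Hashing helps in high dimensions, the sketch below buckets points with a generic p-stable (Gaussian projection) LSH for Euclidean distance, so that a new point is compared only against items in its own bucket instead of against every stored summary. This is not taken from the paper: the summary does not state which LSH family TNStream uses, and the class name `EuclideanLSH`, the parameters `num_hashes` and `bucket_width`, and the ids `mc-1` and `mc-2` are all hypothetical.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    """Single-table p-stable LSH for Euclidean distance (illustrative assumption,
    not the scheme described in the TNStream paper)."""

    def __init__(self, dim, num_hashes=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.bucket_width = bucket_width
        # One Gaussian projection vector and one uniform offset per hash function.
        self.projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                            for _ in range(num_hashes)]
        self.offsets = [rng.uniform(0.0, bucket_width) for _ in range(num_hashes)]
        self.table = defaultdict(set)

    def key(self, point):
        # h_i(x) = floor((a_i . x + b_i) / w); the bucket key concatenates all h_i.
        return tuple(
            math.floor((sum(a * x for a, x in zip(proj, point)) + b) / self.bucket_width)
            for proj, b in zip(self.projections, self.offsets)
        )

    def insert(self, item_id, point):
        self.table[self.key(point)].add(item_id)

    def candidates(self, point):
        # Items sharing the query's bucket; nearby items can still be missed.
        return set(self.table.get(self.key(point), set()))


# Hypothetical usage: index two micro-cluster centers, then look up a new point.
lsh = EuclideanLSH(dim=3)
lsh.insert("mc-1", [0.0, 0.0, 0.0])
lsh.insert("mc-2", [100.0, -80.0, 60.0])
nearby = lsh.candidates([0.0, 0.0, 0.0])
```

A point identical to an indexed center always lands in that center's bucket; points that are merely close collide with high probability rather than certainty, which is why practical systems usually combine several tables or tune `bucket_width`.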
Abstract
Systematic theory for data stream clustering algorithms remains relatively scarce. Density-based methods have recently gained attention, but existing algorithms struggle to handle arbitrarily shaped, multi-density, high-dimensional data while also remaining robust to outliers, and clustering quality deteriorates significantly when data density varies in complex ways. This paper proposes a clustering algorithm based on the novel concept of Tightest Neighbors and introduces a data stream clustering theory based on the Skeleton Set. Building on these theories, the paper develops TNStream, a fully online algorithm. TNStream adaptively determines the clustering radius from local similarity and summarizes the evolution of multi-density data streams in micro-clusters. It then applies a Tightest Neighbors-based clustering algorithm to form…
