Scalable Density-Based Distributed Clustering
Eshref Januzaj, Hans-Peter Kriegel, Martin Pfeifle

TL;DR
This paper introduces a scalable distributed clustering algorithm that balances clustering quality and communication cost by selecting local representatives for global density-based clustering, suitable for large heterogeneous datasets.
Contribution
It proposes a novel scalable density-based distributed clustering method that efficiently selects local representatives to improve global clustering performance and reduce data transmission.
Findings
High-quality clusterings achieved with scalable transmission costs
Efficient local representative selection process
Effective global clustering on large heterogeneous data
Abstract
Clustering has become an increasingly important task in analysing huge amounts of data. Traditional applications require that all data has to be located at the site where it is scrutinized. Nowadays, large amounts of heterogeneous, complex data reside on different, independently working computers which are connected to each other via local or wide area networks. In this paper, we propose a scalable density-based distributed clustering algorithm which allows a user-defined trade-off between clustering quality and the number of transmitted objects from the different local sites to a global server site. Our approach consists of the following steps: First, we order all objects located at a local site according to a quality criterion reflecting their suitability to serve as local representatives. Then we send the best of these representatives to a server site where they are clustered with a…
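The steps described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract does not specify the quality criterion, so the sketch ranks local objects by the size of their eps-neighbourhood, and it uses a plain DBSCAN as a stand-in for the server-side clustering, whose details are cut off above. The function names and the parameters `eps`, `k`, and `min_pts` are hypothetical.

```python
from math import dist

def neighbors(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def select_representatives(points, eps, k):
    """Rank the objects of one local site by a density score, keep the k best.

    Assumption: the score is the eps-neighbourhood size. The paper's own
    quality criterion is not stated in the abstract.
    """
    scores = [len(neighbors(points, i, eps)) for i in range(len(points))]
    order = sorted(range(len(points)), key=lambda i: -scores[i])
    return [points[i] for i in order[:k]]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN (stand-in for the server step); -1 marks noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(points, i, eps)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(points, j, eps)
            if len(nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(nbrs)
    return labels

# Two hypothetical local sites, each with one dense group and one outlier.
site_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
site_b = [(10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1), (20.0, 20.0)]

# Each site transmits only its k best representatives to the server.
reps = select_representatives(site_a, eps=0.5, k=3) \
     + select_representatives(site_b, eps=0.5, k=3)

# The server clusters the received representatives.
labels = dbscan(reps, eps=0.5, min_pts=2)
```

In this sketch `k` plays the role of the user-defined trade-off: a larger `k` transmits more objects per site and gives the server a more detailed picture of each local density structure.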
Taxonomy
Topics: Advanced Clustering Algorithms Research · Caching and Content Delivery · Peer-to-Peer Network Technologies
