Writing summary for the state-of-the-art methods for big data clustering in distributed environment

Dipesh Gyawali

arXiv:2211.05339·cs.DC·November 11, 2022

TL;DR

This paper reviews recent big data clustering methods in distributed environments, highlighting their strengths and weaknesses to guide future research and application in handling diverse data types.

Contribution

It provides a comprehensive summary of state-of-the-art big data clustering techniques in distributed systems, emphasizing their advantages and limitations.

Findings

01. Summarizes recent clustering techniques and their characteristics.

02. Identifies strengths and weaknesses of various methods.

03. Guides future research directions in big data clustering.

Abstract

Big data processing systems store, process, and analyze large volumes of structured and unstructured data. Cluster analysis helps identify previously unseen patterns in that data and the relationships between them. Running clustering analysis across shared machines in big data technologies helps in deriving these relationships and in making decisions that use data in context. It can handle every form of raw and tabular data, whether structured, semi-structured, or unstructured. The data does not have to possess the linearity property; clustering can reflect associative and correlative patterns and groupings. The main contribution of this paper is to gather and summarize recent big data clustering techniques, along with their strengths and weaknesses, in any distributed environment.
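
The abstract does not name a specific algorithm, so the following is only an illustrative sketch of what "clustering analysis over the shared machines" can look like. It assumes a k-means-style method in which each machine holds one partition of the data, computes partial sums locally, and a coordinator merges them. The function names (`map_partition`, `reduce_partials`, `distributed_kmeans`) and the in-process loop standing in for real workers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Partial = Tuple[List[List[float]], List[int]]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Return the index of the centroid closest to `point` (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_partition(partition: Sequence[Point], centroids: Sequence[Point]) -> Partial:
    """Work done on one machine: per-centroid coordinate sums and point counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in partition:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    return sums, counts


def reduce_partials(partials: Sequence[Partial], centroids: Sequence[Point]) -> List[Point]:
    """Work done on the coordinator: merge partial sums into new centroids."""
    dim = len(centroids[0])
    updated: List[Point] = []
    for k, old in enumerate(centroids):
        total = sum(counts[k] for _, counts in partials)
        if total == 0:
            updated.append(old)  # keep a centroid that attracted no points
        else:
            updated.append(
                tuple(sum(sums[k][d] for sums, _ in partials) / total for d in range(dim))
            )
    return updated


def distributed_kmeans(
    partitions: Sequence[Sequence[Point]],
    centroids: Sequence[Point],
    iterations: int = 10,
) -> List[Point]:
    """Run a fixed number of map/reduce rounds; the list comprehension
    stands in for work that would run in parallel on separate machines."""
    current = list(centroids)
    for _ in range(iterations):
        partials = [map_partition(p, current) for p in partitions]
        current = reduce_partials(partials, current)
    return current


# Two partitions, as if stored on two different machines.
partitions = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(10.0, 10.0), (10.0, 11.0)],
]
result = distributed_kmeans(partitions, [(0.0, 0.0), (10.0, 10.0)])
```

Only the small per-centroid sums and counts cross machine boundaries in this sketch, not the raw points. That is the usual motivation for this style of decomposition, though the paper's surveyed methods may differ.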

Taxonomy

Topics: Advanced Clustering Algorithms Research