Survey on Distributed Data Mining in P2P Networks
Rekha Sunny T, Sabu M. Thampi

TL;DR
This survey reviews the evolution, architectures, approaches, and challenges of Distributed Data Mining in P2P networks, emphasizing its importance for handling large, decentralized, and privacy-sensitive data in various applications.
Contribution
The survey provides a comprehensive overview of Distributed Data Mining (DDM) in P2P systems, highlighting recent developments, taxonomy, and key issues in this emerging research area.
Findings
DDM addresses data decentralization and privacy concerns.
Various architectures and approaches exist for P2P data mining.
Challenges include scalability, data heterogeneity, and security.
Abstract
The exponential growth of available digital data, and the need to process it in business and scientific fields, has made it necessary to analyze that data and mine useful knowledge from it. Traditionally, data mining has used a data warehousing model: gather all data at a central site, then run an algorithm on it. Such a centralized approach is fundamentally inappropriate for several reasons: the sheer volume of data, the infeasibility of centralizing data stored at multiple sites, bandwidth limitations, and privacy concerns. To address these problems, Distributed Data Mining (DDM) has emerged as an active research area. DDM pays careful attention to using distributed data, computing, communication, and human resources in a near-optimal fashion. DDM is gaining attention in peer-to-peer (P2P) systems which are emerging…
Taxonomy
Topics: Data Stream Mining Techniques · Peer-to-Peer Network Technologies · Data Management and Algorithms
