Replication in Data Grids: Metrics and Strategies
Tarek Hamrouni

TL;DR
This paper analyzes how replica distribution affects data grid performance, introduces new evaluation metrics, and proposes a data mining-based replication strategy with an algorithm for mining correlated patterns.
Contribution
It provides a comprehensive overview of replication strategies, introduces new metrics for distribution quality, and develops a novel data mining-based replication algorithm.
Findings
Distribution quality significantly impacts performance.
New metrics effectively evaluate distribution quality.
Data mining techniques can enhance replication strategies.
Abstract
We focus in this report on two main axes. The first is dedicated to the study of the effect of replica distribution on data grid performance. In this respect, our main contributions are as follows: 1) An overview of replication strategies, mainly from the viewpoints of the parameters considered in their associated steps and the metrics used in the literature for their evaluation. 2) A study of the impact of placement strategies on data grid performance, which motivated the analysis of the effect of replica distribution quality on the performance results of replication strategies. 3) The proposal of new metrics dedicated to evaluating distribution quality. 4) The setting of an objective evaluation of replication strategies, based on a prior assessment of the distribution quality. The second axis is mainly dedicated to exploiting results of…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Algorithms and Data Compression · Data Mining Algorithms and Applications
