A Performance Study of Monitoring and Information Services for Distributed Systems
Xuehai Zhang, Jeffrey Freschl, and Jennifer M. Schopf

TL;DR
This study compares the performance of three monitoring and information services for distributed systems (MDS, R-GMA, and Hawkeye), revealing their scalability limits and the importance of caching and network topology in optimizing performance.
Contribution
It provides a quantitative performance analysis of three monitoring services, highlighting their different behaviors and design considerations for scalability.
Findings
Caching and prefetching improve performance
Primary components should be at well-connected sites
Different services have distinct scalability characteristics
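The caching and prefetching finding can be illustrated with a minimal sketch. It is not taken from MDS, R-GMA, or Hawkeye: the class `CachedProvider`, the function `slow_probe`, and the time-to-live value are all hypothetical names and parameters chosen for illustration. The idea is only that a server answering from recently collected data avoids invoking an expensive information provider on every user query.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


class CachedProvider:
    """Hypothetical wrapper that puts a time-to-live (TTL) cache in front
    of an expensive resource query."""

    def __init__(self, fetch: Callable[[str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, dict]] = {}
        self.fetch_count = 0  # number of times the real provider was called

    def query(self, resource: str) -> dict:
        """Return cached data if it is younger than the TTL, else refetch."""
        now = self._clock()
        entry = self._cache.get(resource)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = self._fetch(resource)
        self.fetch_count += 1
        self._cache[resource] = (now, value)
        return value

    def prefetch(self, resources: Iterable[str]) -> None:
        """Warm the cache before any user asks for these resources."""
        for resource in resources:
            self.query(resource)


def slow_probe(resource: str) -> dict:
    # Assumed stand-in for an expensive information provider.
    return {"resource": resource, "load": 0.5}


if __name__ == "__main__":
    provider = CachedProvider(slow_probe, ttl_seconds=60.0)
    provider.prefetch(["node1", "node2"])
    for _ in range(100):
        provider.query("node1")  # served from the cache
    print(provider.fetch_count)
```

In this sketch, one hundred user queries after the prefetch trigger no further calls to the underlying probe until the TTL expires, which is the kind of load reduction the finding refers to.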
Abstract
Monitoring and information services form a key component of a distributed system, or Grid. A quantitative study of such services can aid in understanding the performance limitations, advise in the deployment of the systems, and help evaluate future development work. To this end, we study the performance of three monitoring and information services for distributed systems: the Globus Toolkit's Monitoring and Discovery Service (MDS), the European Data Grid Relational Grid Monitoring Architecture (R-GMA), and Hawkeye, part of the Condor project. We perform experiments to test their scalability with respect to number of users, number of resources, and amount of data collected. Our study shows that each approach has different behaviors, often due to their different design goals. In the four sets of experiments we conducted to evaluate the performance of the service components under different…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Peer-to-Peer Network Technologies · Cloud Computing and Resource Management
