A Performance Study of Monitoring and Information Services for Distributed Systems

Xuehai Zhang, Jeffrey Freschl, and Jennifer M. Schopf

arXiv:cs/0304015·cs.PF·May 23, 2007

Open Access

TL;DR

This study compares the performance of three distributed system monitoring services, revealing their scalability limits and the importance of caching and network topology in optimizing performance.

Contribution

The paper provides a quantitative performance analysis of three monitoring services (MDS, R-GMA, and Hawkeye), highlighting how their behaviors differ and the design considerations that affect scalability.

Findings

01 Caching and prefetching improve performance

02 Primary components should be at well-connected sites

03 Different services have distinct scalability characteristics

Abstract

Monitoring and information services form a key component of a distributed system, or Grid. A quantitative study of such services can aid in understanding the performance limitations, advise in the deployment of the systems, and help evaluate future development work. To this end, we study the performance of three monitoring and information services for distributed systems: the Globus Toolkit's Monitoring and Discovery Service (MDS), the European Data Grid Relational Grid Monitoring Architecture (R-GMA), and Hawkeye, part of the Condor project. We perform experiments to test their scalability with respect to number of users, number of resources, and amount of data collected. Our study shows that each approach has different behaviors, often due to their different design goals. In the four sets of experiments we conducted to evaluate the performance of the service components under different…

Taxonomy

Topics: Distributed and Parallel Computing Systems · Peer-to-Peer Network Technologies · Cloud Computing and Resource Management