Distributed Metadata with the AMGA Metadata Catalog
Nuno Santos, Birger Koblitz

TL;DR
This paper discusses the design and implementation of distributed metadata catalog services in large-scale Data Grids, focusing on scalability, performance, and fault-tolerance improvements through replication and distribution mechanisms.
Contribution
The paper introduces replication and distribution mechanisms built into the AMGA Metadata Catalog itself, improving scalability and fault-tolerance without requiring special support from the relational database back-end.
Findings
Improved scalability and fault-tolerance of metadata catalogs
Database-independent replication mechanisms
Enhanced performance in distributed Data Grid environments
Abstract
Catalog Services play a vital role in Data Grids by allowing users and applications to discover and locate the data they need. On large Data Grids, with hundreds of geographically distributed sites, centralized Catalog Services do not provide the required scalability, performance or fault-tolerance. In this article, we start by presenting and discussing the general requirements that applications being developed by the EGEE user community place on Grid Catalogs. This provides the motivation for the second part of the article, where we present the replication and distribution mechanisms we have designed and implemented in the AMGA Metadata Catalog, which is part of the gLite software stack being developed for the EGEE project. Implementing these mechanisms in the catalog itself has the advantages of not requiring any special support from the relational database back-end, of being database…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Distributed systems and fault tolerance
