Scalable Global Grid catalogue for LHC Run3 and beyond
M Martinez Pedreira, C Grigoras (for the ALICE Collaboration)

TL;DR
This paper discusses how to improve the scalability and performance of the AliEn file catalogue, used by ALICE at the LHC, by simplifying the catalogue schema and evaluating alternative backends such as distributed key-value stores.
Contribution
It introduces architectural improvements and evaluates alternative backend technologies to improve the scalability of the global grid catalogue.
Findings
Distributed key-value stores outperform relational databases in scalability.
Schema simplification maintains functionality while enhancing performance.
Benchmark results support adoption of new backend solutions.
Abstract
The AliEn (ALICE Environment) file catalogue is a global unique namespace providing mapping between a UNIX-like logical name structure and the corresponding physical files distributed over 80 storage elements worldwide. Powerful search tools and hierarchical metadata information are integral parts of the system and are used by the Grid jobs as well as local users to store and access all files on the Grid storage elements. The catalogue has been in production since 2005 and over the past 11 years has grown to more than 2 billion logical file names. The backend is a set of distributed relational databases, ensuring smooth growth and fast access. Due to the anticipated fast future growth, we are looking for ways to enhance the performance and scalability by simplifying the catalogue schema while keeping the functionality intact. We investigated different backend solutions, such as…
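To make the mapping described in the abstract concrete, here is a minimal Python sketch of a catalogue that maps UNIX-like logical file names to physical replicas on storage elements. It is an illustration only, not AliEn's schema or API: the class `ToyCatalogue`, its methods, the storage element names, and the `example.org` URLs are all hypothetical, and an in-memory dictionary stands in for whatever backend is actually used.

```python
from collections import defaultdict


class ToyCatalogue:
    """Toy in-memory stand-in for a logical-to-physical file catalogue."""

    def __init__(self):
        # Logical file name (LFN) -> list of (storage element, physical file name).
        self._replicas = defaultdict(list)

    def register(self, lfn, storage_element, pfn):
        """Record one physical replica of a logical file."""
        self._replicas[lfn].append((storage_element, pfn))

    def locate(self, lfn):
        """Return all known replicas of a logical file (empty list if none)."""
        return list(self._replicas.get(lfn, []))

    def list_dir(self, directory):
        """List the names directly under a logical directory, UNIX-style."""
        prefix = directory.rstrip("/") + "/"
        children = set()
        for lfn in self._replicas:
            if lfn.startswith(prefix):
                # Keep only the first path component below the directory.
                children.add(lfn[len(prefix):].split("/", 1)[0])
        return sorted(children)


# Hypothetical usage: one logical file with two replicas, plus a second file.
catalogue = ToyCatalogue()
catalogue.register("/alice/data/run1/file.root", "SE_A",
                   "root://se-a.example.org//store/0001")
catalogue.register("/alice/data/run1/file.root", "SE_B",
                   "root://se-b.example.org//store/9f3c")
catalogue.register("/alice/data/run2/file.root", "SE_A",
                   "root://se-a.example.org//store/0002")

replicas = catalogue.locate("/alice/data/run1/file.root")
listing = catalogue.list_dir("/alice/data")
```

Keying every entry by its full logical path keeps single-file lookups cheap, which is the access pattern a key-value backend favours; the directory listing here scans all keys, so a real system would need an additional index or key layout for hierarchical queries.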
