# Scalable Global Grid catalogue for LHC Run3 and beyond

**Authors:** M Martinez Pedreira, C Grigoras (for the ALICE Collaboration)

arXiv: 1704.05272 · 2019-08-13

## TL;DR

This paper examines how to improve the scalability and performance of the AliEn file catalogue, used by the ALICE experiment at the LHC, by simplifying its schema and evaluating alternative backends, such as distributed key-value stores, in place of the current distributed relational databases.

## Contribution

It describes architectural changes to the global Grid catalogue and evaluates alternative backend technologies, with benchmark results, as replacements for the relational database backend.

## Key findings

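- Distributed key-value stores are evaluated as a replacement for the relational database backend, motivated by the anticipated fast growth of the catalogue.
- Simplifying the catalogue schema is pursued as a way to improve performance and scalability while keeping functionality intact.
- The paper reports benchmark results and conclusions from the technology evaluation.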

## Abstract

The AliEn (ALICE Environment) file catalogue is a global unique namespace providing a mapping between a UNIX-like logical name structure and the corresponding physical files distributed over 80 storage elements worldwide. Powerful search tools and hierarchical metadata information are integral parts of the system and are used by the Grid jobs as well as local users to store and access all files on the Grid storage elements. The catalogue has been in production since 2005 and over the past 11 years has grown to more than 2 billion logical file names. The backend is a set of distributed relational databases, ensuring smooth growth and fast access. Due to the anticipated fast future growth, we are looking for ways to enhance the performance and scalability by simplifying the catalogue schema while keeping the functionality intact. We investigated different backend solutions, such as distributed key-value stores, as a replacement for the relational database. This contribution covers the architectural changes in the system, together with the technology evaluation, benchmark results and conclusions.
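
To make the mapping described in the abstract concrete, the following is a minimal sketch of a catalogue expressed as a key-value lookup from a logical file name to its physical replicas. It is not the AliEn schema or API: the class `ToyCatalogue`, its methods, the paths and the storage URLs are all hypothetical, and a plain in-memory `dict` stands in for whatever distributed store a real deployment would use.

```python
class ToyCatalogue:
    """Hypothetical in-memory stand-in for a key-value catalogue backend."""

    def __init__(self):
        # key: logical file name (UNIX-like path)
        # value: list of physical replica URLs on storage elements
        self._store = {}

    def register(self, lfn, replica_url):
        """Record one more physical replica for a logical file name."""
        self._store.setdefault(lfn, []).append(replica_url)

    def replicas(self, lfn):
        """Return the known replicas of a logical file name (empty if unknown)."""
        return list(self._store.get(lfn, []))

    def find(self, directory):
        """Return, in sorted order, all logical names under a directory prefix."""
        prefix = directory.rstrip("/") + "/"
        return sorted(name for name in self._store if name.startswith(prefix))


# Illustrative usage with made-up paths and storage element URLs.
catalogue = ToyCatalogue()
catalogue.register("/alice/data/run1/a.root", "root://se1.example.org//store/01/a")
catalogue.register("/alice/data/run1/a.root", "root://se2.example.org//store/77/a")
catalogue.register("/alice/data/run2/b.root", "root://se1.example.org//store/02/b")

print(catalogue.replicas("/alice/data/run1/a.root"))
print(catalogue.find("/alice/data"))
```

In this sketch a single logical name resolves to several replicas, which mirrors the logical-to-physical mapping, while the prefix scan in `find` is only a naive placeholder for the search tools and hierarchical metadata the abstract mentions.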

---
Source: https://tomesphere.com/paper/1704.05272