Event Indexing Systems for Efficient Selection and Analysis of HERA Data
L.A.T. Bauerdick, Adrian Fox-Murphy, Tobias Haas, Stefan Stonjek, Enrico Tassi

TL;DR
This paper presents two software systems designed to enhance offline analysis of large-scale event data from the ZEUS detector at HERA, significantly improving data access speed and sample selection efficiency.
Contribution
It introduces and compares two novel event indexing systems, one based on directories and the other on a tag database, for efficient data retrieval in high-energy physics experiments.
Findings
The computing time needed to extract small subsamples from the total event sample was reduced by up to a factor of 20.
Both systems enable quick access to individual events in multi-terabyte datasets.
The tag database offers flexible event selection and substantial efficiency gains.
Abstract
The design and implementation of two software systems introduced to improve the efficiency of offline analysis of event data taken with the ZEUS Detector at the HERA electron-proton collider at DESY are presented. Two different approaches were taken, one using a set of event directories and the other using a tag database based on a commercial object-oriented database management system. These are described and compared. Both systems provide quick direct access to individual collision events in a sequential data store of several terabytes, and they both considerably improve the event analysis efficiency. In particular, the tag database provides a very flexible selection mechanism and can dramatically reduce the computing time needed to extract small subsamples from the total event sample. Gains as large as a factor of 20 have been obtained.
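The idea of direct access into a sequential store through an index can be illustrated with a minimal Python sketch. It is not the ZEUS implementation: the record layout, the tag names (`q2`, `n_tracks`), and the helper functions are all hypothetical, and a plain in-memory dictionary stands in for both the event directories and the tag database described in the abstract.

```python
import io
import struct

# Hypothetical record layout: header (run, event, payload length), then payload.
HEADER = struct.Struct("<IIH")


def write_store(events):
    """Write events one after another; return the store and an offset index."""
    store = io.BytesIO()
    index = {}  # (run, event) -> (byte offset, tags)
    for run, event, tags, payload in events:
        index[(run, event)] = (store.tell(), tags)
        store.write(HEADER.pack(run, event, len(payload)))
        store.write(payload)
    store.seek(0)
    return store, index


def read_at(store, offset):
    """Seek straight to one record and decode it, skipping everything else."""
    store.seek(offset)
    run, event, length = HEADER.unpack(store.read(HEADER.size))
    return run, event, store.read(length)


def select(store, index, predicate):
    """Read only the events whose tags satisfy the predicate."""
    # Visiting offsets in ascending order keeps access to the store monotonic.
    hits = sorted(offset for offset, tags in index.values() if predicate(tags))
    return [read_at(store, offset) for offset in hits]


# Illustrative events; the tag values are made up for this sketch.
events = [
    (1001, 1, {"q2": 12.5, "n_tracks": 4}, b"raw-event-A"),
    (1001, 2, {"q2": 310.0, "n_tracks": 11}, b"raw-event-B"),
    (1002, 1, {"q2": 95.0, "n_tracks": 7}, b"raw-event-C"),
]
store, index = write_store(events)
selected = select(store, index, lambda tags: tags["q2"] > 50.0)
```

In this sketch the selection is evaluated against the small index alone, and only the matching records are read from the store. That is the general mechanism by which an index can cut the cost of extracting a small subsample; the gains reported in the abstract depend on the actual systems and are not reproduced by this toy.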
