Analysis Cyberinfrastructure: Challenges and Opportunities
Kevin Lannon, Paul Brenner, Mike Hildreth, Kenyi Hurtado Anampa, Alan Malta Rodrigues, Kelci Mohrman, Doug Thain, Benjamin Tovar

TL;DR
This paper discusses the current state and future challenges of analysis cyberinfrastructure in High Energy Physics, focusing on software and hardware used for late-stage data analysis, and highlights potential research directions.
Contribution
It provides a reflection on recent experiences with a Python-based analysis framework and explores future R&D topics for analysis cyberinfrastructure in the High-Luminosity LHC era.
Findings
Identified challenges in current analysis workflows.
Highlighted the need for R&D in analysis software and hardware.
Suggested future research directions for analysis cyberinfrastructure.
Abstract
Analysis cyberinfrastructure refers to the combination of software and computer hardware used to support late-stage data analysis in High Energy Physics (HEP). For the purposes of this white paper, late-stage data analysis refers specifically to the step of transforming the most reduced common data format produced by a given experimental collaboration (for example, nanoAOD for the CMS experiment) into histograms. In this white paper, we reflect on observations gathered from a recent data analysis carried out with a recent Python-based analysis framework, and extrapolate these experiences through the High-Luminosity LHC era as a way of highlighting potential R&D topics in analysis cyberinfrastructure.
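To make the "reduced data format into histograms" step concrete, here is a minimal sketch in plain NumPy. It is not the framework discussed in the paper. The column names (`muon_pt`, `event_weight`), the selection cut, and the binning are hypothetical, and the input is randomly generated rather than read from a nanoAOD file.

```python
import numpy as np


def fill_histogram(values, weights, edges):
    """Return weighted bin counts for one flat column of per-event values."""
    counts, _ = np.histogram(values, bins=edges, weights=weights)
    return counts


# Hypothetical stand-in for columns read from a reduced data format.
rng = np.random.default_rng(seed=0)
muon_pt = rng.exponential(scale=30.0, size=1000)  # assumed units: GeV
event_weight = np.ones_like(muon_pt)              # unit weights for illustration

# Columnar selection: a boolean mask applied to whole arrays at once.
selected = muon_pt > 20.0

# 18 equal-width bins between 20 and 200 (assumed binning).
edges = np.linspace(20.0, 200.0, 19)
counts = fill_histogram(muon_pt[selected], event_weight[selected], edges)
```

Values above the last bin edge are dropped by `np.histogram`, so a real analysis would typically decide explicitly how to treat overflow.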
Taxonomy
Topics: Particle physics theoretical and experimental studies · Distributed and Parallel Computing Systems · Computational Physics and Python Applications
