HL-LHC Computing Review Stage 2, Common Software Projects: Data Science Tools for Analysis
Jim Pivarski, Eduardo Rodrigues, Kevin Pedro, Oksana Shadura, Benjamin Krikler, Graeme A. Stewart

TL;DR
This paper reviews the adoption of Python and data science tools in High-Energy Physics, providing insights and recommendations for their integration into HL-LHC computing workflows.
Contribution
It offers a comprehensive overview of current data science tools in HEP and proposes strategic actions for their future adoption in HL-LHC computing.
Findings
Python is increasingly adopted in HEP analysis workflows.
Data science tools enhance analysis efficiency and reproducibility.
Recommendations aim to guide community adoption and development.
Abstract
This paper was prepared by the HEP Software Foundation (HSF) PyHEP Working Group as input to the second phase of the LHCC review of High-Luminosity LHC (HL-LHC) computing, which took place in November 2021. It describes the adoption of Python and data science tools in HEP, discusses the likelihood of future scenarios, and offers recommendations for action by the HEP community.
Taxonomy
Topics: Distributed and Parallel Computing Systems · Scientific Computing and Data Management · Particle physics theoretical and experimental studies
