Coffea -- Columnar Object Framework For Effective Analysis
Nicholas Smith, Lindsey Gray, Matteo Cremonesi, Bo Jayatilaka, Oliver Gutsche, Allison Hall, Kevin Pedro, Maria Acosta, Andrew Melo, Stefano Belforte, Jim Pivarski

TL;DR
The coffea framework introduces a columnar approach to High-Energy Physics analysis using Python, enhancing scalability, reproducibility, and efficiency by leveraging modern data technologies and a modular design.
Contribution
It presents a novel, Python-based, columnar analysis framework that separates analysis logic from data delivery, improving flexibility and performance in HEP data analysis.
Findings
Improved analysis time-to-insight and scalability.
Enhanced reproducibility and portability of analysis workflows.
Successful implementation of CMS data analysis using coffea.
Abstract
The coffea framework provides a new approach to High-Energy Physics analysis, via columnar operations, that improves time-to-insight, scalability, portability, and reproducibility of analysis. It is implemented with the Python programming language, the scientific Python package ecosystem, and commodity big data technologies. To achieve this suite of improvements across many use cases, coffea takes a factorized approach, separating the analysis implementation and data delivery scheme. All analysis operations are implemented using the NumPy or awkward-array packages, which are wrapped to yield user code whose purpose is quickly intuited. Various data delivery schemes are wrapped into a common front-end which accepts user inputs and code, and returns user-defined outputs. We will discuss our experience in implementing analysis of CMS data using the coffea framework along with a discussion…
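The two ideas in the abstract, columnar operations and the separation of analysis code from data delivery, can be illustrated with a minimal NumPy-only sketch. This is not the coffea API: the names `process`, `run`, `PT_BINS`, the column names `n_muons` and `muon_pt`, and the cut values are all hypothetical, and the counts-plus-flat-content layout is only a simplified stand-in for the jagged arrays that awkward-array provides.

```python
import numpy as np

# Hypothetical histogram bin edges (GeV); chosen for illustration only.
PT_BINS = np.array([0.0, 20.0, 40.0, 60.0, 100.0])


def process(chunk):
    """Analysis logic: columnar operations on one chunk, no per-event loop."""
    counts = chunk["n_muons"]  # number of muons in each event
    pt = chunk["muon_pt"]      # flat array holding every muon's pT

    # Per-muon cut, evaluated on the whole column at once.
    good = pt > 20.0

    # Map each muon to the index of the event it belongs to.
    event_index = np.repeat(np.arange(len(counts)), counts)

    # Count good muons per event, then require at least two.
    n_good = np.bincount(event_index[good], minlength=len(counts))
    selected_events = n_good >= 2

    # Keep good muons that belong to a selected event.
    keep = good & selected_events[event_index]
    hist, _ = np.histogram(pt[keep], bins=PT_BINS)

    return {
        "n_events": len(counts),
        "n_selected": int(selected_events.sum()),
        "pt_hist": hist,
    }


def run(chunks, processor):
    """Data delivery: feed chunks to the processor and merge the outputs.

    A serial loop is assumed here; a parallel or distributed scheme could
    replace it without touching `process`.
    """
    total = None
    for chunk in chunks:
        out = processor(chunk)
        if total is None:
            total = out
        else:
            total = {key: total[key] + out[key] for key in total}
    return total


# Two toy chunks of made-up events.
chunks = [
    {"n_muons": np.array([2, 0, 3]),
     "muon_pt": np.array([25.0, 45.0, 10.0, 30.0, 70.0])},
    {"n_muons": np.array([1, 2]),
     "muon_pt": np.array([50.0, 22.0, 38.0])},
]
result = run(chunks, process)
```

Because `process` only sees arrays and returns outputs that can be added together, the same function could in principle be handed to a different delivery scheme, which is the kind of factorization the abstract describes.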
Taxonomy
Topics: Particle physics theoretical and experimental studies · Computational Physics and Python Applications · Astrophysics and Cosmic Phenomena
