An array-oriented Python interface for FastJet
Aryan Roy, Jim Pivarski, Chad Wells Freer

TL;DR
This paper introduces a new array-oriented Python interface for FastJet that makes analysis of high-energy physics (HEP) data more efficient and more Pythonic, and reports a 20-fold speedup of HEP analysis code.
Contribution
The authors developed fastjet, a Python package that adds an array-oriented interface to FastJet, improving efficiency and integration with scientific Python tools in HEP analysis.
Findings
In one case, adopting the interface together with other array-oriented tools accelerated HEP analysis code by a factor of 20.
Seamless integration with Scikit-HEP ecosystem libraries.
Enhanced interoperability with machine learning tools.
Abstract
Analysis on HEP data is an iterative process in which the results of one step often inform the next. In an exploratory analysis, it is common to perform one computation on a collection of events, then view the results (often with histograms) to decide what to try next. Awkward Array is a Scikit-HEP Python package that enables data analysis with array-at-a-time operations to implement cuts as slices, combinatorics as composable functions, etc. However, most C++ HEP libraries, such as FastJet, have an imperative, one-particle-at-a-time interface, which would be inefficient in Python and goes against the grain of the array-at-a-time logic of scientific Python. Therefore, we developed fastjet, a pip-installable Python package that provides FastJet C++ binaries, the classic (particle-at-a-time) Python interface, and the new array-oriented interface for use with Awkward Array. The new…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Computational Physics and Python Applications · Scientific Computing and Data Management
