Awkward to RDataFrame and back
Ianna Osborne, Jim Pivarski

TL;DR
This paper introduces a zero-copy conversion method between Awkward Arrays and RDataFrame, enabling flexible, efficient data analysis across different packages and languages with minimal overhead.
Contribution
It presents new functions for on-demand, zero-copy conversion between Awkward Arrays and RDataFrame, leveraging JIT techniques for efficient data handling.
Findings
Conversion is zero-copy and on-demand, avoiding data duplication.
The approach supports flexible analysis workflows combining Awkward Arrays and RDataFrame.
Examples demonstrate practical analysis and data extraction using the new functions.
Abstract
Awkward Arrays and RDataFrame provide two very different ways of performing calculations at scale. Adding the ability to convert between them without copying gives users the best of both, with greater flexibility to mix different packages and languages in an analysis. In Awkward Array version 2, the ak.to_rdataframe function presents a view of an Awkward Array as an RDataFrame source. This view is generated on demand and the data are not copied. The column readers are generated based on the run-time type of the views. The readers are passed to a generated source derived from ROOT::RDF::RDataSource. The ak.from_rdataframe function converts the selected columns to native Awkward Arrays. Details of the implementation, which exploits JIT techniques, are discussed. Examples of analyzing data stored in Awkward Arrays through the high-level RDataFrame interface are presented.…
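The phrase "generated on demand and the data are not copied" can be illustrated with a small standard-library sketch. This is not the Awkward Array or ROOT implementation: the class name JaggedView and the offsets-plus-content layout used here are assumptions chosen only to show what a zero-copy, on-demand column reader over variable-length data looks like.

```python
from array import array


class JaggedView:
    """Hypothetical read-only view of a jagged column (offsets + flat content)."""

    def __init__(self, offsets, content):
        self._offsets = offsets
        # memoryview shares the underlying buffer; nothing is copied here.
        self._content = memoryview(content)

    def __len__(self):
        return len(self._offsets) - 1

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        start, stop = self._offsets[i], self._offsets[i + 1]
        # Slicing a memoryview yields another view, built only when requested.
        return self._content[start:stop]


# Three entries of lengths 3, 0 and 2 stored in one flat buffer.
content = array("d", [1.1, 2.2, 3.3, 4.4, 5.5])
offsets = array("q", [0, 3, 3, 5])
view = JaggedView(offsets, content)

row = view[0]                      # created on demand, no copy
content[0] = 9.9                   # modify the original buffer
first = row[0]                     # the view observes the change
lengths = [len(view[i]) for i in range(len(view))]
```

Because each row is a view rather than a copy, a change to the original buffer is visible through a row obtained earlier, which is the property that makes this kind of conversion cheap.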
Taxonomy
Topics: Scientific Research and Discoveries · Computational Physics and Python Applications
