Penalized Splines for Smooth Representation of High-dimensional Monte Carlo Datasets
Nathan Whitehorn, Jakob van Santen, Sven Lafebre

TL;DR
This paper introduces a penalized spline method to efficiently represent high-dimensional Monte Carlo datasets in high-energy physics, simplifying data handling and analysis tasks such as maximum-likelihood fitting.
Contribution
It applies the penalized spline technique to compute compact B-spline representations of large multi-dimensional Monte Carlo histograms, reducing storage requirements and avoiding interpolation artifacts.
Findings
B-spline representations reduce the storage required for large multi-dimensional histograms.
Smooth spline fits avoid the numerical problems caused by unfilled bins and interpolation artifacts.
The resulting fits simplify common analysis tasks, in particular maximum-likelihood fitting.
Abstract
Detector response to a high-energy physics process is often estimated by Monte Carlo simulation. For purposes of data analysis, the results of this simulation are typically stored in large multi-dimensional histograms, which can quickly become both too large to easily store and manipulate and numerically problematic due to unfilled bins or interpolation artifacts. We describe here an application of the penalized spline technique to efficiently compute B-spline representations of such tables and discuss aspects of the resulting B-spline fits that simplify many common tasks in handling tabulated Monte Carlo data in high-energy physics analysis, in particular their use in maximum-likelihood fitting.
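To make the idea concrete, the following is a minimal one-dimensional sketch of a penalized B-spline fit to a histogram with unfilled bins. It is not the authors' implementation: the function names (`bspline_basis`, `fit_penalized_spline`), the difference penalty on the coefficients, the knot layout, the smoothing value, and the toy Gaussian table are all assumptions chosen for illustration. The tables discussed in the abstract are multi-dimensional, which this sketch does not attempt to cover.

```python
import numpy as np
from scipy.interpolate import BSpline


def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at x; shape (len(x), n_basis)."""
    n_basis = len(knots) - degree - 1
    # Identity coefficients give one output column per basis function.
    return BSpline(knots, np.eye(n_basis), degree)(x)


def fit_penalized_spline(x, y, weights, knots, degree=3,
                         smoothing=1e-2, penalty_order=2):
    """Weighted least-squares B-spline fit with a difference penalty.

    Minimizes sum_i w_i (y_i - f(x_i))^2 + smoothing * ||D c||^2, where D
    takes finite differences of the coefficients c (an assumed penalty form).
    """
    B = bspline_basis(x, knots, degree)
    D = np.diff(np.eye(B.shape[1]), n=penalty_order, axis=0)
    lhs = (B.T * weights) @ B + smoothing * (D.T @ D)
    rhs = (B.T * weights) @ y
    coeffs = np.linalg.solve(lhs, rhs)
    return BSpline(knots, coeffs, degree)


# Toy "histogram": bin centers and contents following a smooth peak.
x = np.linspace(0.0, 1.0, 41)
truth = np.exp(-((x - 0.5) ** 2) / 0.02)
y = truth.copy()
weights = np.ones_like(x)

# Simulate unfilled bins: zero content, and zero weight so the fit ignores them.
y[18:21] = 0.0
weights[18:21] = 0.0

# Clamped cubic knot vector on [0, 1] (knot spacing is an arbitrary choice).
degree = 3
interior = np.linspace(0.0, 1.0, 11)
knots = np.concatenate([[0.0] * degree, interior, [1.0] * degree])

spline = fit_penalized_spline(x, y, weights, knots, degree=degree)
filled = spline(x[18:21])        # smooth values where the table had holes
slope = spline.derivative()(x)   # analytic derivative of the fitted table
```

Only the knot vector and the coefficient array need to be stored, rather than every bin. Because the fit is a smooth piecewise polynomial, its derivatives are available in closed form, which is convenient for gradient-based likelihood maximization.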
