Neural Scaling Laws for Boosted Jet Tagging
Matthias Vigl, Nicole Hartman, Michael Kagan, Lukas Heinrich

TL;DR
This paper investigates neural scaling laws for boosted jet classification in high energy physics, showing that increased compute consistently moves performance toward an effective limit and that lower-level input features can raise that limit.
Contribution
It derives compute-optimal scaling laws for boosted jet tagging and examines how data repetition and feature choice affect the achievable performance limit.
Findings
Compute-optimal scaling laws describe how performance improves with compute and approaches an effective performance limit.
Lower-level features can raise the asymptotic performance limit.
Data repetition, common in HEP where simulation is expensive, modifies the scaling and can be summarized by a quantifiable effective dataset size.
Abstract
The success of Large Language Models (LLMs) has established that scaling compute, through joint increases in model capacity and dataset size, is the primary driver of performance in modern machine learning. While machine learning has long been an integral component of High Energy Physics (HEP) data analysis workflows, the compute used to train state-of-the-art HEP models remains orders of magnitude below that of industry foundation models. With scaling laws only beginning to be studied in the field, we investigate neural scaling laws for boosted jet classification using the public JetClass dataset. We derive compute-optimal scaling laws and identify an effective performance limit that can be consistently approached through increased compute. We study how data repetition, common in HEP where simulation is expensive, modifies the scaling, yielding a quantifiable effective dataset size…
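To make the idea of an effective performance limit concrete, the sketch below fits a saturating power law, loss ≈ l_inf + a · C^(−alpha), to loss-versus-compute points. The functional form, the function name fit_saturating_power_law, and the synthetic data are all assumptions chosen for illustration; the abstract does not state which parametrization the authors fit, and no numbers here come from the paper.

```python
import math


def fit_saturating_power_law(compute, loss, n_grid=200):
    """Fit loss ~ l_inf + a * compute**(-alpha) by scanning the floor l_inf.

    Assumed functional form (not taken from the paper). For a fixed l_inf,
    log(loss - l_inf) is linear in log(compute), so a and alpha follow from
    ordinary least squares; the best l_inf minimizes the squared error
    measured in the original loss space.
    """
    xs = [math.log(c) for c in compute]
    n = len(xs)
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    loss_min = min(loss)
    best = None
    for i in range(n_grid):
        # Candidate floors lie in [0, min(loss)), so the logs stay defined.
        l_inf = loss_min * i / n_grid
        ys = [math.log(v - l_inf) for v in loss]
        y_mean = sum(ys) / n
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
        intercept = y_mean - slope * x_mean
        a, alpha = math.exp(intercept), -slope
        sse = sum(
            (l_inf + a * c ** (-alpha) - v) ** 2 for c, v in zip(compute, loss)
        )
        if best is None or sse < best[0]:
            best = (sse, l_inf, a, alpha)
    _, l_inf, a, alpha = best
    return l_inf, a, alpha


# Synthetic, noise-free points generated from known parameters (illustrative only).
true_l_inf, true_a, true_alpha = 0.5, 2.0, 0.3
compute = [10.0 ** k for k in range(1, 9)]
loss = [true_l_inf + true_a * c ** (-true_alpha) for c in compute]

l_inf, a, alpha = fit_saturating_power_law(compute, loss)
print(f"estimated floor={l_inf:.3f}, prefactor={a:.3f}, exponent={alpha:.3f}")
```

In this picture the fitted floor l_inf plays the role of the limit that more compute approaches but does not cross. The grid scan keeps the example dependency-free; a nonlinear least-squares routine would be the more usual choice on real, noisy measurements.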
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Computational Physics and Python Applications · Adversarial Robustness in Machine Learning
