Scaling Laws in Jet Classification
Joshua Batson, Yonatan Kahn

TL;DR
This paper finds scaling laws in top versus QCD jet classification: each classifier's test loss falls as a power law in training set size, so classifiers should be compared across dataset sizes rather than at a single fixed size.
Contribution
It shows that six physically motivated jet classifiers each follow power-law scaling with a distinct index, which means a ranking of classifiers made at one training set size need not hold at another.
Findings
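Six physically motivated classifiers exhibit power-law scaling of the binary cross-entropy test loss with training set size, each with a distinct power-law index.
The optimal classifier can change considerably as the training set is scaled up.
The authors speculate on interpreting these results using earlier models of scaling laws observed in natural language and image datasets.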
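To illustrate why the optimal classifier can depend on dataset size, here is a minimal sketch assuming each classifier's test loss is a pure power law, L(N) = a * N**(-alpha), with no irreducible-loss offset. The function names and the amplitudes and indices used are hypothetical values chosen for illustration, not numbers from the paper.

```python
def power_law_loss(amplitude, index, n_train):
    """Test loss under the assumed pure power law L(N) = amplitude * N**(-index)."""
    return amplitude * n_train ** (-index)


def crossover_size(a1, alpha1, a2, alpha2):
    """Training set size at which two assumed power-law loss curves intersect.

    Setting a1 * N**(-alpha1) == a2 * N**(-alpha2) and solving for N gives
    N = (a1 / a2) ** (1 / (alpha1 - alpha2)).
    """
    if alpha1 == alpha2:
        raise ValueError("curves with equal indices never cross (or coincide)")
    return (a1 / a2) ** (1.0 / (alpha1 - alpha2))


# Hypothetical classifiers: A starts lower but improves slowly,
# B starts higher but has a steeper power-law index.
A = (1.0, 0.1)
B = (2.0, 0.2)

n_star = crossover_size(*A, *B)

# Below the crossover A has the lower loss; above it B does.
small_n, large_n = 100, 100_000
a_wins_small = power_law_loss(*A, small_n) < power_law_loss(*B, small_n)
b_wins_large = power_law_loss(*B, large_n) < power_law_loss(*A, large_n)
```

With these illustrative numbers, a comparison at 100 training examples and a comparison at 100,000 would rank the two classifiers in opposite orders, which is the situation the findings warn about.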
Abstract
We demonstrate the emergence of scaling laws in the benchmark top versus QCD jet classification problem in collider physics. Six distinct physically-motivated classifiers exhibit power-law scaling of the binary cross-entropy test loss as a function of training set size, with distinct power law indices. This result highlights the importance of comparing classifiers as a function of dataset size rather than for a fixed training set, as the optimal classifier may change considerably as the dataset is scaled up. We speculate on the interpretation of our results in terms of previous models of scaling laws observed in natural language and image datasets.
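As a rough illustration of what extracting a power-law index involves, the sketch below fits a straight line to log(loss) versus log(training set size) by ordinary least squares. It assumes a pure power law with no constant offset, and the data points are synthetic values generated for the example; neither the function name nor the numbers come from the paper, and the authors' actual fitting procedure may differ.

```python
import math


def fit_power_law(train_sizes, test_losses):
    """Least-squares fit of log(loss) = log(amplitude) - index * log(N).

    Returns (amplitude, index) for the assumed form loss = amplitude * N**(-index).
    """
    xs = [math.log(n) for n in train_sizes]
    ys = [math.log(loss) for loss in test_losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic, noiseless data from a known power law (hypothetical values).
sizes = [1e3, 1e4, 1e5, 1e6]
losses = [3.0 * n ** (-0.25) for n in sizes]

amplitude, index = fit_power_law(sizes, losses)
```

On noiseless synthetic data the fit recovers the generating amplitude and index up to floating-point error. Real loss curves are noisy and may flatten at large N, so a fit of this kind is only meaningful over the range where the log-log plot is approximately linear.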
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Particle physics theoretical and experimental studies
Methods: Sparse Evolutionary Training
