ColliderML: The First Release of an OpenDataDetector High-Luminosity Physics Benchmark Dataset
Doğa Elitez, Paul Gessinger, Daniel Murnane, Marcus Selchou Raaholt, Andreas Salzburger, Stine Kofoed Skov, Andreas Stefl, Anna Zaborowska

TL;DR
ColliderML is a comprehensive, open dataset of simulated proton-proton collision events at the LHC, designed to advance machine learning research in high-energy physics with realistic, high-luminosity data.
Contribution
It introduces the first large-scale, open, experiment-agnostic dataset for HL-LHC conditions, including detailed simulation, digitisation, and reconstruction for ML applications.
Findings
Provides one million simulated events covering Standard Model and BSM processes.
Includes extensive single-particle samples with realistic pile-up overlay.
Initial collider physics benchmarks demonstrate the dataset's utility.
Abstract
We introduce ColliderML - a large, open, experiment-agnostic dataset of fully simulated and digitised proton-proton collisions in High-Luminosity Large Hadron Collider conditions ( TeV, mean pile-up ). ColliderML provides one million events across ten Standard Model and Beyond Standard Model processes, plus extensive single-particle samples, all produced with modern next-to-leading-order matrix element calculation and showering, realistic per-event pile-up overlay, a validated OpenDataDetector geometry, and standard reconstructions. The release, provided on the ML-friendly Hugging Face platform, fills a major gap for machine learning (ML) research on detector-level data. We present the physics coverage and the generation, simulation, digitisation and reconstruction pipeline, describe the data format and access, and report initial collider physics benchmarks.
Taxonomy
Topics: Particle physics theoretical and experimental studies · High-Energy Particle Collisions Research · Particle Detector Development and Performance
