Using CMS Open Data in research -- challenges and directions
Kati Lassila-Perini, Clemens Lange, Edgar Carrera Jarrin, Matthew Bellis

TL;DR
This paper reviews the open data released by the CMS experiment at CERN's LHC, discusses the challenges of using these data in research, and suggests measures to improve their usability for scientific studies.
Contribution
It provides a comprehensive overview of available CMS open data, identifies key challenges, and proposes directions to improve data accessibility and usability.
Findings
Almost all CMS data from the first LHC run (2010-2012), together with the corresponding simulated samples, are publicly available.
Challenges include data complexity and lack of user-friendly tools.
Recommendations for improving data usability are discussed.
Abstract
The CMS experiment at CERN has released research-quality data from particle collisions at the LHC since 2014. Almost all data from the first LHC run in 2010-2012 with the corresponding simulated samples are now in the public domain, and several scientific studies have been performed using these data. This paper summarizes the available data and tools, reviews the challenges in using them in research, and discusses measures to improve their usability.
