Belle II grid-based user analysis
J. V. Bennett, J. Guilliams, M. Hernandez Villanueva, D. E. Jaffe, P. J. Laycock, A. Panta, C. Serfon, I. Ueda

TL;DR
This paper discusses the challenges and considerations for grid-based user analysis at the Belle II experiment, which will handle massive data samples to explore physics beyond the Standard Model.
Contribution
The paper highlights specific challenges in storing, processing, and analyzing Belle II's large-scale data, and invites further development and discussion of the tools and techniques involved.
Findings
Identifies data management challenges at Belle II
Emphasizes need for improved analysis tools
Calls for community discussion on analysis techniques
Abstract
The Belle II experiment at the SuperKEKB accelerator is a next-generation B-factory aiming to collect 50 ab⁻¹, about 50 times the data collected at Belle, to study rare processes and make precision measurements that may expose physics beyond the Standard Model. Corresponding to roughly 100 PB of storage for raw data, plus dozens of PB per year for Monte Carlo (MC) and analysis data, these massive samples require careful planning for the storage, processing, and analysis of data. As part of the Snowmass process, this white paper notes some of the challenges that await grid-based user analysis at the intensity frontier and invites further discussion and exploration to improve the tools and techniques necessary to leverage the massive data samples that will be available at Belle II.
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Scientific Computing and Data Management
