Data intensive physics analysis in Azure cloud
Igor Sfiligoi, Frank Würthwein, Diego Davila

TL;DR
This paper discusses how UCSD leveraged Azure cloud resources, integrated with the Open Science Grid, to enhance data-intensive physics analysis for the CMS experiment, reducing analysis time and improving resource flexibility.
Contribution
The paper demonstrates how commercial cloud resources can be integrated with existing scientific grid infrastructure for data-intensive physics analysis.
Findings
Successful deployment of Azure cloud resources for CMS data analysis
Enhanced data caching infrastructure improved job efficiency
Reduced time to results for physics analyses
Abstract
The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) is one of the largest data producers in the scientific world, with standard data products centrally produced, and then used by often competing teams within the collaboration. This work is focused on how a local institution, University of California San Diego (UCSD), partnered with the Open Science Grid (OSG) to use Azure cloud resources to augment its available computing to accelerate time to results for multiple analyses pursued by a small group of collaborators. The OSG is a federated infrastructure allowing many independent resource providers to serve many independent user communities in a transparent manner. Historically the resources would come from various research institutions, spanning small universities to large HPC centers, based on either community needs or grant allocations, so adding commercial…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Big Data Technologies and Applications · Scientific Computing and Data Management
