Expanding IceCube GPU computing into the Clouds
Igor Sfiligoi, Shava Smallen, Frank Würthwein, Nicole Wolter, David Schultz, Benedikt Riedel

TL;DR
This paper discusses integrating commercial cloud GPU resources into IceCube's workflow, doubling GPU usage and demonstrating cost-effective expansion of computational capacity for scientific research.
Contribution
It describes a setup that integrates commercial cloud GPUs into IceCube's existing workload management system through the Open Science Grid (OSG) infrastructure, with CloudBank used for budget management.
Findings
GPU wall hours used by IceCube approximately doubled over a 2-week period
Added over 3.1 fp32 EFLOP hours
Cost approximately $58,000
Abstract
The IceCube collaboration relies on GPU compute for many of its needs, including ray tracing simulation and machine learning activities. GPUs are, however, still a relatively scarce commodity in the scientific resource provider community, so we expanded the available resource pool with GPUs provisioned from commercial cloud providers. The provisioned resources were fully integrated into the normal IceCube workload management system through the Open Science Grid (OSG) infrastructure, with CloudBank used for budget management. The result was an approximate doubling of GPU wall hours used by IceCube over a period of 2 weeks, adding over 3.1 fp32 EFLOP hours for a price tag of about $58k. This paper describes the setup used and the operational experience.
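The headline figures in the abstract can be turned into a rough unit cost. The sketch below is only illustrative arithmetic on the two quoted numbers. It assumes that one fp32 EFLOP hour means 10^18 fp32 operations per second sustained for one hour, and it treats "over 3.1" and "about $58k" as exact values, so the results are approximate. The function and constant names are hypothetical and do not come from the paper.

```python
# Figures quoted in the abstract (both are approximate there).
EFLOP_HOURS = 3.1    # fp32 EFLOP hours added ("over 3.1")
COST_USD = 58_000    # total price tag ("about $58k")


def cost_per_eflop_hour(cost_usd: float, eflop_hours: float) -> float:
    """Return the average cost in USD of one fp32 EFLOP hour."""
    return cost_usd / eflop_hours


def total_fp32_operations(eflop_hours: float) -> float:
    """Convert EFLOP hours to a raw operation count.

    Assumption: 1 EFLOP hour = 1e18 operations/second for 3600 seconds.
    """
    return eflop_hours * 1e18 * 3600


if __name__ == "__main__":
    unit_cost = cost_per_eflop_hour(COST_USD, EFLOP_HOURS)
    operations = total_fp32_operations(EFLOP_HOURS)
    print(f"Approximate cost per fp32 EFLOP hour: ${unit_cost:,.0f}")
    print(f"Approximate fp32 operations provisioned: {operations:.3e}")
```

Because the abstract gives 3.1 EFLOP hours as a lower bound, the unit cost computed this way is, if anything, an overestimate of the average price paid.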
