From Bare Metal to Virtual: Lessons Learned when a Supercomputing Institute Deploys its First Cloud
Evan F. Bollig, James C. Wilgenbusch

TL;DR
This paper shares twelve lessons learned from rapidly deploying a compliant on-premise cloud at a supercomputing institute, highlighting technical, organizational, and user support challenges.
Contribution
It provides a comprehensive case study of building and deploying a compliant cloud infrastructure in under 18 months, emphasizing organizational and training aspects.
Findings
Successful deployment within 18 months
Importance of leadership and training in cloud projects
Technical and organizational lessons learned
Abstract
As primary provider for research computing services at the University of Minnesota, the Minnesota Supercomputing Institute (MSI) has long been responsible for serving the needs of a user-base numbering in the thousands. In recent years, MSI, like many other HPC centers, has observed a growing need for self-service, on-demand, data-intensive research, as well as the emergence of many new controlled-access datasets for research purposes. In light of this, MSI constructed a new on-premise cloud service, named Stratus, which is architected from the ground up to easily satisfy data-use agreements and fill four gaps left by traditional HPC. The resulting OpenStack cloud, constructed from HPC-specific compute nodes and backed by Ceph storage, is designed to fully comply with controls set forth by the NIH Genomic Data Sharing Policy. Herein, we present twelve lessons learned during the…
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
