Rosetta: a container-centric science platform for resource-intensive, interactive data analysis
Stefano Alberto Russo, Sara Bertocco, Claudio Gheller, Giuliano Taffoni

TL;DR
Rosetta is a versatile science platform that leverages containerized microservices to enable resource-intensive, interactive data analysis across various scientific domains, supporting custom environments and multiple computing resources.
Contribution
The paper introduces a novel container-centric architecture based on microservices for flexible, reproducible, and resource-efficient interactive data analysis.
Findings
Supports a wide range of container engines and runtimes
Enables seamless integration with workload management systems
Applicable across multiple scientific disciplines
Abstract
Rosetta is a science platform for resource-intensive, interactive data analysis that runs user tasks as software containers. It is built on a novel architecture that frames user tasks as microservices (independent, self-contained units), which makes it possible to fully support custom and user-defined software packages, libraries and environments. These include complete remote desktops and GUI applications, in addition to common analysis environments such as Jupyter Notebooks. Rosetta relies on Open Container Initiative containers, which allow for safe, effective and reproducible code execution; it can use a number of container engines and runtimes, and it seamlessly supports several workload management systems, thus enabling containerized workloads on a wide range of computing resources. Although developed in the astronomy and astrophysics space, Rosetta can support virtually any science and…
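As a rough illustration of the idea in the abstract (one task description, several container engines, optionally a workload manager in front), the following minimal Python sketch builds the command line for a containerized task. It is not Rosetta's actual code or API: the `Task` class, the `engine_command` and `with_slurm` helpers, and the example image name are all hypothetical, chosen only to show how a self-contained task could be dispatched to different engines.

```python
from __future__ import annotations

import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of a user task framed as a self-contained service."""
    image: str  # OCI image reference (assumed example: "jupyter/base-notebook")
    port: int   # port the service listens on inside the container


def engine_command(task: Task, engine: str) -> list[str]:
    """Build the argv that starts the task with the given container engine."""
    if engine in ("docker", "podman"):
        # Publish the service port on the host and remove the container on exit.
        return [engine, "run", "--rm", "-p", f"{task.port}:{task.port}", task.image]
    if engine == "singularity":
        # Singularity shares the host network by default, so no port mapping is needed.
        return ["singularity", "run", f"docker://{task.image}"]
    raise ValueError(f"unsupported engine: {engine}")


def with_slurm(command: list[str], cpus: int) -> list[str]:
    """Wrap a command so that it runs through the Slurm workload manager."""
    return ["srun", f"--cpus-per-task={cpus}", *command]


# Illustrative usage: the same task, run via Singularity on a Slurm cluster.
task = Task(image="jupyter/base-notebook", port=8888)
print(shlex.join(with_slurm(engine_command(task, "singularity"), cpus=4)))
```

Keeping the task description separate from the engine and the workload manager is what lets the same container run on a laptop with Docker or on an HPC cluster with Singularity and Slurm; a real platform would also need to handle authentication, storage, and networking, which this sketch leaves out.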
Taxonomy
Topics: Distributed and Parallel Computing Systems · Scientific Computing and Data Management · Cloud Computing and Resource Management
