Cybercosm: New Foundations for a Converged Science Data Ecosystem
Mark Asch, François Bodin, Micah Beck, Terry Moore, Michela Taufer, Martin Swany, Jean-Pierre Vilotte

TL;DR
Cybercosm proposes a novel distributed system architecture that virtualizes and converges local resources across nodes, enabling scalable, portable, and interoperable workflows for data-intensive science amidst growing data volumes and AI integration.
Contribution
It introduces a minimally sufficient hypervisor layer that virtualizes system resources, facilitating ecosystem convergence and workflow portability in scientific data environments.
Findings
Supports scalable and portable workflows
Enables resource sharing across distributed systems
Facilitates ecosystem convergence for data-intensive science
Abstract
Scientific communities naturally tend to organize around data ecosystems created by the combination of their observational devices, their data repositories, and the workflows essential to carry their research from observation to discovery. However, these legacy data ecosystems are now breaking down under the pressure of the exponential growth in the volume and velocity of these workflows, which are further complicated by the need to integrate the highly data-intensive methods of the Artificial Intelligence revolution. Enabling groundbreaking science that makes full use of this new, data-saturated research environment will require distributed systems that support dramatically improved resource sharing, workflow portability and composability, and data ecosystem convergence. The Cybercosm vision presented in this white paper describes a radically different approach to the architecture…
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
