Octopus: Experiences with a Hybrid Event-Driven Architecture for Distributed Scientific Computing
Haochen Pan, Ryan Chard, Sicheng Zhou, Alok Kamatar, Rafael Vescovi, Valérie Hayot-Sasson, André Bauer, Maxime Gonthier, Kyle Chard, Ian Foster

TL;DR
Octopus is a scalable, hybrid event-driven architecture designed for distributed scientific computing, enabling high-throughput event processing, reliable communication, and fine-grained access control across diverse research infrastructures.
Contribution
The paper introduces Octopus, a novel hybrid cloud-edge event fabric that supports scalable, reliable, and secure event-driven scientific applications with automatic processing triggers.
Findings
Supports over 4.2 million events/sec for event production (publishing)
Supports over 9.6 million events/sec for event consumption
Demonstrates applicability to various scientific use cases
Abstract
Scientific research increasingly relies on distributed computational resources, storage systems, networks, and instruments, ranging from HPC and cloud systems to edge devices. Event-driven architecture (EDA) benefits applications targeting distributed research infrastructures by enabling the organization, communication, processing, reliability, and security of events generated from many sources. To support the development of scientific EDA, we introduce Octopus, a hybrid, cloud-to-edge event fabric designed to link many local event producers and consumers with cloud-hosted brokers. Octopus can be scaled to meet demand, permits the deployment of highly available Triggers for automatic event processing, and enforces fine-grained access control. We identify requirements in self-driving laboratories, scientific data automation, online task scheduling, epidemic modeling, and dynamic workflow…
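The roles named in the abstract (producers, consumers, brokers, Triggers, and access control) can be illustrated with a minimal in-memory sketch. This is not the Octopus API or implementation: the class `ToyBroker`, its methods (`grant`, `add_trigger`, `publish`, `consume`), and the principal and topic names are all hypothetical, chosen only to show how a broker might check permissions per principal and topic, append events to a topic log, and invoke registered handlers automatically on each new event.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Event = dict  # assumption: an event is a plain dictionary payload


class ToyBroker:
    """Hypothetical single-process stand-in for a cloud-hosted broker."""

    def __init__(self) -> None:
        # topic name -> ordered log of events
        self.topics: Dict[str, List[Event]] = defaultdict(list)
        # set of (principal, topic, action) permissions; action is "read" or "write"
        self.acl: Set[Tuple[str, str, str]] = set()
        # topic name -> handlers run automatically on each published event
        self.triggers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def grant(self, principal: str, topic: str, action: str) -> None:
        """Allow one principal to perform one action on one topic."""
        self.acl.add((principal, topic, action))

    def _check(self, principal: str, topic: str, action: str) -> None:
        if (principal, topic, action) not in self.acl:
            raise PermissionError(f"{principal} may not {action} {topic}")

    def add_trigger(self, topic: str, handler: Callable[[Event], None]) -> None:
        """Register a handler that runs whenever an event lands on the topic."""
        self.triggers[topic].append(handler)

    def publish(self, principal: str, topic: str, event: Event) -> int:
        """Append an event to the topic log and return its offset."""
        self._check(principal, topic, "write")
        self.topics[topic].append(event)
        for handler in self.triggers[topic]:
            handler(event)
        return len(self.topics[topic]) - 1

    def consume(self, principal: str, topic: str, offset: int = 0) -> List[Event]:
        """Return all events on the topic starting at the given offset."""
        self._check(principal, topic, "read")
        return list(self.topics[topic][offset:])


# Usage: an instrument publishes, an analysis service consumes,
# and a trigger reacts to every new event without polling.
broker = ToyBroker()
broker.grant("instrument-1", "lab.samples", "write")
broker.grant("analysis-1", "lab.samples", "read")

processed: List[int] = []
broker.add_trigger("lab.samples", lambda e: processed.append(e["sample_id"]))

broker.publish("instrument-1", "lab.samples", {"sample_id": 7})
print(broker.consume("analysis-1", "lab.samples"))
print(processed)
```

A real deployment would replace the in-memory log with durable, replicated brokers and the permission set with an identity-backed authorization service; the sketch only shows how the pieces relate to one another.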
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems
