Operon: Incremental Construction of Ragged Data via Named Dimensions
Sungbin Moon, Jiho Park, Suyoung Hwang, Donghyun Koh, Seunghyun Moon, Minhyeong Lee

TL;DR
Operon is a Rust-based workflow engine that efficiently manages ragged data with named dimensions, enabling correct, incremental, and parallel data processing in complex workflows.
Contribution
The paper introduces a novel formalism of named dimensions with explicit dependencies, together with a static verification system for ragged data workflows.
Findings
Operon achieves a 14.94x reduction in baseline overhead compared to existing workflow engines.
It guarantees deterministic and confluent execution in parallel environments.
Operon maintains near-linear output rates as workloads scale.
Abstract
Modern data processing workflows frequently encounter ragged data: collections with variable-length elements that arise naturally in domains like natural language processing, scientific measurements, and autonomous AI agents. Existing workflow engines lack native support for tracking the shapes and dependencies inherent to ragged data, forcing users to manage complex indexing and dependency bookkeeping manually. We present Operon, a Rust-based workflow engine that addresses these challenges through a novel formalism of named dimensions with explicit dependency relations. Operon provides a domain-specific language where users declare pipelines with dimension annotations that are statically verified for correctness, while the runtime system dynamically schedules tasks as data shapes are incrementally discovered during execution. We formalize the mathematical foundation for reasoning about…
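To make the idea of named dimensions with dependency relations more concrete, here is a minimal Rust sketch. It is not Operon's DSL or API; the names `Dim`, `verify`, `Shape`, `discover`, and `known_elements` are hypothetical, chosen only for illustration. It assumes a dimension's extent may depend on the index of a previously declared dimension (for example, the number of sentences depends on which document is being read), and that extents are recorded one at a time as they are discovered at run time.

```rust
use std::collections::HashMap;

/// A named dimension whose extent may depend on other dimensions.
/// Hypothetical type for illustration; not part of Operon.
struct Dim {
    name: &'static str,
    depends_on: Vec<&'static str>,
}

/// Static check: a dimension may only depend on dimensions declared before it.
fn verify(dims: &[Dim]) -> Result<(), String> {
    let mut seen: Vec<&str> = Vec::new();
    for d in dims {
        for dep in &d.depends_on {
            if !seen.contains(dep) {
                return Err(format!(
                    "dimension `{}` depends on undeclared `{}`",
                    d.name, dep
                ));
            }
        }
        seen.push(d.name);
    }
    Ok(())
}

/// Extents discovered so far, keyed by (dimension name, parent index path).
#[derive(Default)]
struct Shape {
    extents: HashMap<(&'static str, Vec<usize>), usize>,
}

impl Shape {
    /// Record the extent of `dim` under the given parent indices.
    fn discover(&mut self, dim: &'static str, parent: Vec<usize>, extent: usize) {
        self.extents.insert((dim, parent), extent);
    }

    /// Total number of elements currently known along `dim`.
    fn known_elements(&self, dim: &str) -> usize {
        self.extents
            .iter()
            .filter(|((d, _), _)| *d == dim)
            .map(|(_, n)| *n)
            .sum()
    }
}

fn main() {
    // Declarations: `sentence` is ragged because its extent depends on `doc`.
    let dims = vec![
        Dim { name: "doc", depends_on: vec![] },
        Dim { name: "sentence", depends_on: vec!["doc"] },
    ];
    verify(&dims).expect("dimension declarations should be well-formed");

    // Shapes arrive incrementally: doc 2 has not been read yet.
    let mut shape = Shape::default();
    shape.discover("doc", vec![], 3);
    shape.discover("sentence", vec![0], 2);
    shape.discover("sentence", vec![1], 0);
    println!("sentences known so far: {}", shape.known_elements("sentence"));
}
```

In this sketch a scheduler could start per-sentence work for documents 0 and 1 before the extent for document 2 is known. That is the kind of incremental behavior the abstract describes, although the real engine's mechanism may differ.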
Taxonomy
Topics: Scientific Computing and Data Management · Machine Learning in Materials Science · Machine Learning and Data Classification
