AAFLOW: Scalable Patterns for Agentic AI Workflows
Arup Kumar Sarker, Mills Staylor, Aymen Alsaadi, Gregor von Laszewski, Shantenu Jha, Geoffrey Fox

TL;DR
AAFLOW is a distributed runtime for agentic AI workflows that reduces data-movement overhead and speeds up pipeline execution in large language model systems.
Contribution
The paper presents a formal execution model that represents agentic workflows as an operator abstraction, and uses Apache Arrow and Cylon to build high-performance, communication-efficient workflows.
Findings
Up to 4.64 times pipeline speedup
2.8 times gains in embedding and upsert phases
Retains comparable LLM generation throughput
Abstract
Agentic workflows in large language model systems integrate retrieval, reasoning, and memory, but existing frameworks suffer from scalability and reproducibility limitations due to fragmented data orchestration, serialization overhead, and non-deterministic execution. Although these frameworks increase flexibility, they lack a formal execution model that adheres to the principles of high-performance computing. We introduce AAFLOW, a unified distributed runtime that creates communication-efficient execution plans by modeling agentic workflows as an operator abstraction. Using Apache Arrow and Cylon, AAFLOW creates a zero-copy data plane that allows direct interoperability between preprocessing, embedding, and vector retrieval without serialization overhead. To lower coordination costs, it uses resource-deterministic scheduling and asynchronous batching. While retaining…
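The asynchronous batching mentioned in the abstract can be illustrated with a minimal sketch. This is not AAFLOW's implementation: the names `embed_batch`, `batching_worker`, and `run` are hypothetical, the embedding function is a stand-in that returns the text length, and only the Python standard library is used. The idea shown is that a producer enqueues text chunks while a consumer groups them into fixed-size batches, so the per-call cost of the embedding stage is paid once per batch rather than once per chunk.

```python
import asyncio
from typing import List, Sequence, Tuple

# Sentinel marking the end of the input stream (hypothetical helper).
_DONE = object()


def embed_batch(texts: Sequence[str]) -> List[List[float]]:
    """Stand-in for an embedding model: one 1-d vector per text.

    Assumption for illustration only; a real system would call a model here.
    """
    return [[float(len(t))] for t in texts]


async def batching_worker(
    queue: asyncio.Queue,
    batch_size: int,
    vectors: List[List[float]],
    batch_sizes: List[int],
) -> None:
    """Drain the queue, embedding items in groups of at most batch_size."""
    batch: List[str] = []
    while True:
        item = await queue.get()
        if item is _DONE:
            break
        batch.append(item)
        if len(batch) == batch_size:
            vectors.extend(embed_batch(batch))
            batch_sizes.append(len(batch))
            batch = []
    # Flush whatever is left when the stream ends.
    if batch:
        vectors.extend(embed_batch(batch))
        batch_sizes.append(len(batch))


async def _run_async(
    chunks: Sequence[str], batch_size: int
) -> Tuple[List[List[float]], List[int]]:
    queue: asyncio.Queue = asyncio.Queue()
    vectors: List[List[float]] = []
    batch_sizes: List[int] = []
    worker = asyncio.create_task(
        batching_worker(queue, batch_size, vectors, batch_sizes)
    )
    # Producer: enqueue chunks in order, then signal completion.
    for chunk in chunks:
        await queue.put(chunk)
    await queue.put(_DONE)
    await worker
    return vectors, batch_sizes


def run(
    chunks: Sequence[str], batch_size: int = 4
) -> Tuple[List[List[float]], List[int]]:
    """Synchronous entry point: returns (vectors, sizes of each batch)."""
    return asyncio.run(_run_async(chunks, batch_size))
```

Calling `run(chunks, batch_size=2)` on five chunks would issue three embedding calls instead of five. A single FIFO queue with one consumer keeps output order equal to input order, which is one simple way to keep such a stage deterministic; the paper's own scheduling mechanism is not described in this excerpt.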
