FlowMesh: A Service Fabric for Composable LLM Workflows
Junyi Shen, Noppanat Wadlom, Lingfeng Zhou, Dequan Wang, Xu Miao, Lei Fang, Yao Lu

TL;DR
FlowMesh is a scalable, cost-effective service fabric that optimizes composable LLM workflows by decomposing them into fine-grained operators, sharing work across users, and managing resources dynamically, cutting cost by up to 3.8x and energy use by 2.0x relative to baseline solutions.
Contribution
The paper introduces FlowMesh, a multi-tenant service fabric that executes and optimizes complex LLM workflows using fine-grained operators and a global control plane.
Findings
Up to 3.8x cost reduction compared to baseline solutions
2.0x lower energy consumption
Maintains efficiency under dynamic and failure-prone conditions
Abstract
AI deployment increasingly resembles a pipeline of data transformation, fine-tuning, and agent interactions rather than a monolithic LLM job; recent examples include RLHF/RLAIF training and agentic workflows. To cope with this shift, we propose FlowMesh, a multi-tenant service fabric that executes and optimizes these workloads as one shared service instead of isolated pipelines. It decomposes workflows into fine-grained operators with recorded lineage, enabling de-duplication of work across users and batching requests on the same hardware while preserving per-workflow provenance. A global control plane maintains a cluster-wide pool of ready operators and uses a single utility function to pick both the batch and the worker, balancing throughput, cost, and data locality on heterogeneous GPUs. The data plane is an elastic fleet of stateless workers backed by a content-addressable store,…
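The two mechanisms the abstract names, de-duplication through recorded lineage and a single utility function that picks both the batch and the worker, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `operator_key`, `ResultStore`, `Worker`, `utility`, and `pick`, the linear form of the utility, and its weights do not come from the paper, whose actual design is not given in this excerpt.

```python
import hashlib
import json
from dataclasses import dataclass


def operator_key(op_name, params, input_keys):
    """Content address of an operator: a hash over its name, parameters,
    and the keys of its inputs (a stand-in for recorded lineage)."""
    payload = json.dumps(
        {"op": op_name, "params": params, "inputs": input_keys},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ResultStore:
    """Toy in-memory content-addressable store (hypothetical)."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # how many operators actually ran

    def run(self, op_name, params, input_keys, fn):
        key = operator_key(op_name, params, input_keys)
        if key not in self._results:  # identical work from any user runs once
            self._results[key] = fn()
            self.executions += 1
        return key

    def get(self, key):
        return self._results[key]


@dataclass(frozen=True)
class Worker:
    name: str
    throughput: float      # requests per second (assumed unit)
    cost_per_hour: float   # assumed unit
    cached_keys: frozenset  # content keys already local to this worker


def utility(batch, worker, w_tput=1.0, w_cost=0.5, w_local=2.0):
    """Assumed linear utility over throughput, cost, and data locality."""
    locality = sum(1 for k in batch if k in worker.cached_keys) / len(batch)
    return (w_tput * worker.throughput
            - w_cost * worker.cost_per_hour
            + w_local * locality)


def pick(batches, workers):
    """Choose the (batch, worker) pair with the highest utility."""
    pairs = ((b, w) for b in batches for w in workers)
    return max(pairs, key=lambda bw: utility(bw[0], bw[1]))
```

In this sketch, two users who submit the same operator with the same parameters and inputs obtain the same key, so the work executes once and each workflow can still record the key in its own provenance. Scoring batch and worker jointly, rather than in two separate passes, lets a slower worker win when it already holds the batch's inputs.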
Taxonomy
Topics: Scientific Computing and Data Management · Machine Learning in Materials Science · Distributed and Parallel Computing Systems
