OnePiece: A Large-Scale Distributed Inference System with RDMA for Complex AI-Generated Content (AIGC) Workflows
June Chen, Neal Xu, Gragas Huang, Bok Zhou, Stephen Liu

TL;DR
OnePiece is a scalable, RDMA-optimized distributed inference system for complex AI-generated content (AIGC) workflows. It improves throughput, resource utilization, and scalability in production environments.
Contribution
OnePiece introduces a microservice-based architecture with RDMA-optimized communication, a double-ring buffer that resolves deadlocks in RDMA memory access, and elastic resource management for efficient AIGC inference.
Findings
Reduces GPU resource consumption by 16x in image-to-video generation.
Improves throughput and scalability for multi-stage AIGC workflows.
Enhances fault tolerance and resource efficiency in production settings.
Abstract
The rapid growth of AI-generated content (AIGC) has enabled high-quality creative production across diverse domains, yet existing systems face critical inefficiencies in throughput, resource utilization, and scalability under concurrent workloads. This paper introduces OnePiece, a large-scale distributed inference system with RDMA optimized for multi-stage AIGC workflows. By decomposing pipelines into fine-grained microservices and leveraging one-sided RDMA communication, OnePiece significantly reduces inter-node latency and CPU overhead while improving GPU utilization. The system incorporates a novel double-ring buffer design to resolve deadlocks in RDMA-aware memory access without CPU involvement. Additionally, a dynamic Node Manager allocates resources elastically across workflow stages in response to real-time load. Experimental results demonstrate that OnePiece reduces GPU resource…
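The abstract says only that the Node Manager allocates resources elastically across workflow stages in response to real-time load; it does not give the policy. As a rough illustration of what such elastic allocation could look like, the sketch below splits a fixed pool of workers across stages in proportion to each stage's pending load. Everything in it is an assumption made for illustration: the function name `allocate_workers`, the stage names, the proportional (largest-remainder) rule, and the per-stage minimum are hypothetical and are not taken from the paper.

```python
def allocate_workers(load, total_workers, min_per_stage=1):
    """Split total_workers across stages in proportion to pending load.

    Hypothetical policy for illustration only; not the paper's algorithm.
    load maps a stage name to its number of pending requests.
    """
    stages = list(load)
    if total_workers < min_per_stage * len(stages):
        raise ValueError("not enough workers to give every stage its minimum")

    # Every stage first receives its guaranteed minimum.
    alloc = {stage: min_per_stage for stage in stages}
    spare = total_workers - min_per_stage * len(stages)
    total_load = sum(load.values())
    if spare == 0 or total_load == 0:
        # Nothing left to hand out, or no backlog: spare workers stay idle.
        return alloc

    # Ideal fractional share of the spare workers for each stage.
    shares = {stage: spare * load[stage] / total_load for stage in stages}
    for stage in stages:
        alloc[stage] += int(shares[stage])

    # Largest-remainder rounding: leftover workers go to the stages
    # whose fractional share was truncated the most.
    leftover = spare - sum(int(shares[stage]) for stage in stages)
    by_remainder = sorted(
        stages, key=lambda s: shares[s] - int(shares[s]), reverse=True
    )
    for stage in by_remainder[:leftover]:
        alloc[stage] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical three-stage workflow with 16 workers in the pool.
    pending = {"text_encode": 2, "diffusion": 30, "vae_decode": 8}
    print(allocate_workers(pending, total_workers=16))
```

Re-running the function whenever the pending counts change shifts workers toward whichever stage is currently the bottleneck, which is the general behaviour the abstract attributes to the Node Manager.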
Taxonomy
Topics: Cell Image Analysis Techniques · Ferroelectric and Negative Capacitance Devices · Advanced Neural Network Applications
