TD-Orch: Scalable Load-Balancing for Distributed Systems with Applications to Graph Processing
Yiwei Zhao, Qiushi Lin, Hongbo Kang, Guy E. Blelloch, Laxman Dhulipala, Yan Gu, Charles McGuffey, Phillip B. Gibbons

TL;DR
TD-Orch is a scalable task-data orchestration framework for distributed systems. It uses a distributed push-pull technique, moving both tasks and data, to balance load across machines under skewed data requests, which improves performance in applications such as graph processing.
Contribution
The paper presents TD-Orch, a scalable orchestration framework with a simple application developer interface, and demonstrates its effectiveness through TDO-GP, a distributed graph processing system.
Findings
TD-Orch achieves up to 2.8x speedup over existing distributed scheduling baselines.
TDO-GP attains an average of 4.1x speedup over prior graph systems.
The framework keeps load balanced across machines under highly skewed data requests (data hot spots) while keeping communication overhead minimal.
Abstract
In this paper, we introduce a task-data orchestration abstraction that supports a range of distributed applications, including graph processing and key-value stores. Given a batch of lambda tasks each requesting one or more data items, where both tasks and data are distributed across multiple machines, each task must be co-located with its target data (by moving tasks and/or data) and then executed. We present TD-Orch, an efficient and scalable orchestration framework featuring a simple application developer interface. TD-Orch employs a distributed push-pull technique, leveraging the bidirectional flow of both tasks and data to achieve scalable load balance across machines even under highly skewed data requests (data hot spots), with minimal communication overhead. Experimental results show that TD-Orch achieves up to 2.8x speedup over existing distributed scheduling baselines. Building…
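To make the co-location problem in the abstract concrete, here is a minimal single-process sketch of the push-versus-pull choice. It is not TD-Orch's algorithm or interface: the names `owner`, `orchestrate`, `machine_load`, and `hot_threshold`, the modulo placement, and the fixed-threshold rule are all assumptions made for illustration. Each task requests one integer key; a task is pushed to the machine that owns a rarely requested key, while a frequently requested (hot) key is pulled to the machines where its tasks already reside.

```python
from collections import Counter, defaultdict


def owner(key, num_machines):
    """Hypothetical static placement: the machine that stores `key`."""
    return key % num_machines


def orchestrate(tasks, num_machines, hot_threshold):
    """Decide where each task runs.

    tasks: list of (task_id, home_machine, key) tuples.
    Returns (placement, replicas), where placement maps task_id to the
    machine that executes it and replicas maps each hot key to the set
    of machines that receive a copy of its data.
    """
    demand = Counter(key for _, _, key in tasks)
    placement = {}
    replicas = defaultdict(set)
    for task_id, home, key in tasks:
        if demand[key] > hot_threshold:
            # Pull: the data item is hot, so copy it to the task's home
            # machine and run the task there.
            placement[task_id] = home
            replicas[key].add(home)
        else:
            # Push: the data item is cold, so move the task to its owner.
            placement[task_id] = owner(key, num_machines)
    return placement, dict(replicas)


def machine_load(placement, num_machines):
    """Number of tasks executed on each machine."""
    counts = Counter(placement.values())
    return [counts.get(m, 0) for m in range(num_machines)]


if __name__ == "__main__":
    # Eight tasks spread over four machines all request key 0 (a hot spot);
    # four more tasks on machine 0 request distinct cold keys.
    tasks = [(i, i % 4, 0) for i in range(8)]
    tasks += [(8, 0, 1), (9, 0, 2), (10, 0, 3), (11, 0, 5)]

    push_only, _ = orchestrate(tasks, 4, hot_threshold=len(tasks))
    mixed, copies = orchestrate(tasks, 4, hot_threshold=4)
    print("push only:", machine_load(push_only, 4))  # [8, 2, 1, 1]
    print("push-pull:", machine_load(mixed, 4))      # [2, 4, 3, 3]
    print("replicas of hot keys:", copies)
```

In this toy setting, pushing every task to its data's owner piles all eight hot-key tasks onto one machine, whereas pulling the hot item sends one copy to each of the four machines and leaves those tasks where they started. A real system would have to detect hot keys and make these decisions in a distributed fashion, which this sketch does not attempt.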
Taxonomy
Topics: Graph Theory and Algorithms · Cloud Computing and Resource Management · Advanced Database Systems and Queries
