Templating Shuffles
Qizhen Zhang, Jiacheng Wu, Ang Chen, Vincent Liu, Boon Thau Loo

TL;DR
TeShu is a flexible network shuffling framework for data analytics in cloud data centers, using parameterized templates and sampling to optimize performance across diverse workloads and network configurations.
Contribution
TeShu introduces parameterized shuffle templates, instantiated through sampling, so that shuffling strategies adapt dynamically to different application workloads and data center layouts.
Findings
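Preliminary experiments show that TeShu's shuffling optimizations improve performance for data analytics workloads.
The same experiments show that TeShu adapts to a variety of data center network scenarios.
Treating shuffling as a unified service layer makes it more flexible and extensible.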
Abstract
Cloud data centers are evolving fast. At the same time, today's large-scale data analytics applications require non-trivial performance tuning that is often specific to the applications, workloads, and data center infrastructure. We propose TeShu, which makes network shuffling an extensible unified service layer common to all data analytics. Since an optimal shuffle depends on a myriad of factors, TeShu introduces parameterized shuffle templates, instantiated by accurate and efficient sampling that enables TeShu to dynamically adapt to different application workloads and data center layouts. Our preliminary experimental results show that TeShu efficiently enables shuffling optimizations that improve performance and adapt to a variety of data center network scenarios.
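To make the idea of a "parameterized shuffle template instantiated by sampling" more concrete, here is a minimal sketch in Python. It is not TeShu's API or algorithm: every name in it (`shuffle_template`, `estimate_reduction_ratio`, `combine_threshold`, and so on) is hypothetical, and the decision rule (combine values by key before sending when a sample suggests many duplicate keys) is an assumed stand-in for the kind of choice a template might expose as a parameter.

```python
import random
from collections import defaultdict


def estimate_reduction_ratio(records, sample_rate, rng):
    """Crude estimate of (distinct keys / records) computed on a random sample.

    A low ratio suggests that combining by key before sending would shrink
    the data noticeably. Hypothetical helper, not part of TeShu.
    """
    sample = [r for r in records if rng.random() < sample_rate]
    if not sample:
        return 1.0  # no evidence of duplicates, so assume no reduction
    distinct_keys = len({key for key, _ in sample})
    return distinct_keys / len(sample)


def combine(records):
    """Sum the values of records that share a key."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return list(totals.items())


def shuffle_template(records, num_dest, sample_rate=0.1,
                     combine_threshold=0.5, seed=0):
    """Hypothetical template: sample, choose a strategy, then partition.

    `sample_rate` and `combine_threshold` are the template parameters; the
    sample "instantiates" the template by deciding whether to combine.
    """
    rng = random.Random(seed)
    ratio = estimate_reduction_ratio(records, sample_rate, rng)
    use_combine = ratio < combine_threshold
    outgoing = combine(records) if use_combine else list(records)

    # Hash-partition the outgoing records across destinations.
    partitions = [[] for _ in range(num_dest)]
    for key, value in outgoing:
        partitions[hash(key) % num_dest].append((key, value))
    return use_combine, partitions
```

For example, calling `shuffle_template([(i % 4, 1) for i in range(1000)], num_dest=2)` on a workload with only four distinct keys would be expected to choose the combining path, while a workload with all-unique keys would be sent as-is. A real system would also account for network topology and the cost of sampling itself, which this sketch ignores.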
Taxonomy
Topics: Cloud Computing and Resource Management · Software-Defined Networks and 5G · IoT and Edge/Fog Computing
