Exoshuffle: An Extensible Shuffle Architecture
Frank Sifei Luan, Stephanie Wang, Samyukta Yagati, Sean Kim, Kenneth Lien, Isaac Ong, Tony Hong, SangBin Cho, Eric Liang, Ion Stoica

TL;DR
Exoshuffle introduces an extensible, flexible shuffle architecture built on Ray, achieving high performance and scalability while enabling new ML applications and simplifying the implementation of shuffle optimizations.
Contribution
The paper presents a novel shuffle architecture that decouples the control plane from the data plane, making it more flexible and easier to extend than monolithic shuffle systems while keeping performance and scalability competitive.
Findings
Achieved shuffle performance comparable to monolithic systems.
Set a new record as the most cost-efficient sorting system.
Enabled ML training applications to leverage scalable shuffle easily.
Abstract
Shuffle is one of the most expensive communication primitives in distributed data processing and is difficult to scale. Prior work addresses the scalability challenges of shuffle by building monolithic shuffle systems. These systems are costly to develop, and they are tightly integrated with batch processing frameworks that offer only high-level APIs such as SQL. New applications, such as ML training, require more flexibility and finer-grained interoperability with shuffle. They are often unable to leverage existing shuffle optimizations. We propose an extensible shuffle architecture. We present Exoshuffle, a library for distributed shuffle that offers competitive performance and scalability as well as greater flexibility than monolithic shuffle systems. We design an architecture that decouples the shuffle control plane from the data plane without sacrificing performance. We build…
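To make the shuffle primitive concrete, here is a minimal single-process sketch of an all-to-all shuffle written as ordinary application-level functions. It is not Exoshuffle's API and does not use Ray; the names `map_task`, `reduce_task`, and `simple_shuffle_sort` are hypothetical, and a thread pool stands in for a distributed task scheduler. It only illustrates the pattern the abstract refers to: map tasks partition their input by key range, and each reduce task merges its partition from every map output.

```python
import bisect
from concurrent.futures import ThreadPoolExecutor
from typing import List


def map_task(records: List[int], boundaries: List[int]) -> List[List[int]]:
    """Split one input block into one bucket per reducer, by key range."""
    buckets: List[List[int]] = [[] for _ in range(len(boundaries) + 1)]
    for r in records:
        # bisect_right returns the index of the range that r falls into.
        buckets[bisect.bisect_right(boundaries, r)].append(r)
    return buckets


def reduce_task(parts: List[List[int]]) -> List[int]:
    """Merge every map output destined for one reducer and sort it."""
    merged = [r for part in parts for r in part]
    merged.sort()
    return merged


def simple_shuffle_sort(blocks: List[List[int]],
                        boundaries: List[int]) -> List[List[int]]:
    """Run the map stage, the all-to-all exchange, then the reduce stage."""
    num_reducers = len(boundaries) + 1
    # Assumption: a thread pool stands in for a distributed scheduler.
    with ThreadPoolExecutor() as pool:
        map_outputs = list(pool.map(lambda b: map_task(b, boundaries), blocks))
        # All-to-all exchange: reducer j reads bucket j from every map output.
        reduce_inputs = [[out[j] for out in map_outputs]
                         for j in range(num_reducers)]
        return list(pool.map(reduce_task, reduce_inputs))


if __name__ == "__main__":
    blocks = [[5, 1, 9], [3, 8, 2], [7, 4, 6]]
    print(simple_shuffle_sort(blocks, boundaries=[3, 6]))
```

With M input blocks and R reducers, the exchange step moves M × R intermediate pieces, which is one reason shuffle becomes expensive as data and cluster size grow.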
Taxonomy
Topics: Algorithms and Data Compression · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
