Scaling All-to-all Operations Across Emerging Many-Core Supercomputers

Shannon Kinkead; Jackson Wesley; Whit Schonbein; David DeBonis; Matthew G. F. Dosanjh; and Amanda Bienz

arXiv:2601.17606 · cs.DC · January 27, 2026

TL;DR

This paper introduces new all-to-all communication algorithms for emerging many-core supercomputers and reports up to a 3x speedup over system MPI at 32 nodes of Sapphire Rapids systems.

Contribution

The paper presents novel all-to-all algorithms tailored to many-core architectures and analyzes their performance against existing algorithms and system MPI.

Findings

01 Achieved up to 3x speedup over system MPI

02 Demonstrated effectiveness on Sapphire Rapids systems

03 Provided detailed performance analysis of algorithms

Abstract

Performant all-to-all collective operations in MPI are critical to fast Fourier transforms, transposition, and machine learning applications. There are many existing implementations for all-to-all exchanges on emerging systems, with the achieved performance dependent on many factors, including message size, process count, architecture, and parallel system partition. This paper presents novel all-to-all algorithms for emerging many-core systems. Further, the paper presents a performance analysis against existing algorithms and system MPI, with novel algorithms achieving up to 3x speedup over system MPI at 32 nodes of state-of-the-art Sapphire Rapids systems.
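
For readers unfamiliar with the collective, the following is a minimal sketch of what an all-to-all exchange computes, simulated in plain Python without MPI. It uses a simple pairwise schedule, a well-known baseline and not one of the paper's novel algorithms. The function name `alltoall_pairwise` and the list-of-lists layout are assumptions made for illustration only.

```python
def alltoall_pairwise(send):
    """Simulate an all-to-all exchange among p ranks (illustrative only).

    send[r][d] is the block that rank r sends to rank d.
    Returns recv, where recv[r][s] is the block rank r received from rank s.
    """
    p = len(send)
    recv = [[None] * p for _ in range(p)]
    # In step k, every rank sends to (rank + k) % p, so each destination
    # receives exactly one block per step.
    for step in range(p):
        for rank in range(p):
            dest = (rank + step) % p
            recv[dest][rank] = send[rank][dest]
    return recv


p = 4  # hypothetical process count
send = [[f"{r}->{d}" for d in range(p)] for r in range(p)]
recv = alltoall_pairwise(send)
```

The result is the transpose of the send layout, which is why the abstract lists transposition and fast Fourier transforms among the operations that depend on all-to-all performance.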

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Interconnection Networks and Systems · Embedded Systems Design Techniques