A Model for Communication in Clusters of Multi-core Machines
Christine Task, Arun Chauhan

TL;DR
This paper introduces a new formal model for communication in multi-core clusters, aiming to improve the design and optimization of collective communication algorithms by accurately capturing multi-core architectures.
Contribution
It presents a formal model that accounts for properties specific to multi-core machines, such as shared external network connections and communication through shared memory, enabling better optimization of communication patterns in modern clusters.
Findings
The model effectively captures shared-memory and network communication properties.
It provides a foundation for designing more efficient collective algorithms.
The model has the potential to enable significant performance improvements in multi-core clusters.
Abstract
A common paradigm for scientific computing is distributed message-passing systems, and a common approach to these systems is to implement them across clusters of high-performance workstations. As multi-core architectures become increasingly mainstream, these clusters are very likely to include multi-core machines. However, the theoretical models that are currently used to develop communication algorithms across these systems do not take into account the unique properties of processes running on shared-memory architectures, including shared external network connections and communication via shared memory locations. Because of this, existing algorithms are far from optimal for modern clusters. Additionally, recent attempts to adapt these algorithms to multi-core systems have proceeded without the introduction of a more accurate formal model and have generally neglected to capitalize on…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Cloud Computing and Resource Management · Distributed Systems and Fault Tolerance
