A Communication- and Memory-Aware Model for Load Balancing Tasks
Jonathan Lifflander, Philippe P. Pebay, Nicole L. Slattengren, Pierre L. Pebay, Robert A. Pfeiffer, Joseph D. Kotulski, Sean T. McGovern

TL;DR
This paper introduces a unified model for load balancing in distributed systems that considers computation, communication, and memory, enabling better task placement and achieving significant speedups.
Contribution
It presents a novel reduced-order model and a distributed heuristic algorithm for load balancing that effectively explores complex tradeoffs in distributed-memory systems.
Findings
Achieves up to 2.3x speedup on an electromagnetics code
Demonstrates quick convergence to near-optimal solutions
Formalizes the optimization as a mixed-integer linear program
Abstract
While load balancing in distributed-memory computing has been well-studied, we present an innovative approach to this problem: a unified, reduced-order model that combines three key components to describe "work" in a distributed system: computation, communication, and memory. Our model enables an optimizer to explore complex tradeoffs in task placement, such as increased parallelism at the expense of data replication, which increases memory usage. We propose a fully distributed, heuristic-based load balancing optimization algorithm, and demonstrate that it quickly finds close-to-optimal solutions. We formalize the complex optimization problem as a mixed-integer linear program, and compare it to our strategy. Finally, we show that when applied to an electromagnetics code, our approach obtains up to 2.3x speedups for the imbalanced execution.
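To make the tradeoff described in the abstract concrete, here is a minimal sketch of a combined work model. It is an illustration under stated assumptions, not the paper's formulation: the names `Task`, `rank_work`, `rank_memory`, and `max_rank_work`, the weights `alpha` and `beta`, the shared-block memory accounting, and all numbers are hypothetical. Each rank's work is taken to be a weighted sum of its compute load and its off-rank communication volume, and a placement is treated as infeasible if any rank exceeds a memory limit. Tasks that read the same data block share one copy when they sit on the same rank, so spreading them across ranks replicates the block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    load: float  # estimated compute time (hypothetical units)
    block: str   # name of the shared data block this task reads


def rank_memory(assignment, tasks, block_size, rank):
    """Memory on `rank`: each distinct block used there is stored once."""
    blocks = {tasks[t].block for t, r in assignment.items() if r == rank}
    return sum(block_size[b] for b in blocks)


def rank_work(assignment, tasks, edges, rank, alpha=1.0, beta=1.0):
    """Assumed work model: alpha * compute + beta * off-rank communication."""
    compute = sum(tasks[t].load for t, r in assignment.items() if r == rank)
    # An edge costs this rank only if exactly one endpoint lives on it.
    comm = sum(vol for (a, b), vol in edges.items()
               if (assignment[a] == rank) != (assignment[b] == rank))
    return alpha * compute + beta * comm


def max_rank_work(assignment, tasks, edges, block_size, ranks, mem_limit):
    """Largest per-rank work, or infinity if any rank exceeds mem_limit."""
    if any(rank_memory(assignment, tasks, block_size, r) > mem_limit
           for r in ranks):
        return float("inf")
    return max(rank_work(assignment, tasks, edges, r) for r in ranks)


# Hypothetical data: t0 and t1 read the same block "A" and exchange messages.
tasks = {"t0": Task(4.0, "A"), "t1": Task(4.0, "A"), "t2": Task(2.0, "B")}
edges = {("t0", "t1"): 1.0}
block_size = {"A": 3.0, "B": 2.0}
ranks = [0, 1]

packed = {"t0": 0, "t1": 0, "t2": 1}  # block "A" stored once, no off-rank traffic
spread = {"t0": 0, "t1": 1, "t2": 1}  # more parallelism, block "A" replicated

packed_cost = max_rank_work(packed, tasks, edges, block_size, ranks, mem_limit=5.0)
spread_cost = max_rank_work(spread, tasks, edges, block_size, ranks, mem_limit=5.0)
spread_tight = max_rank_work(spread, tasks, edges, block_size, ranks, mem_limit=4.0)
```

In this toy instance, spreading `t0` and `t1` lowers the maximum per-rank work even after paying for their communication, but it replicates block `A`, so the same placement becomes infeasible once the memory limit is tightened. An optimizer over such a model has to weigh all three terms at once rather than compute load alone.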
Taxonomy
Topics: Distributed and Parallel Computing Systems · Real-Time Systems Scheduling · Parallel Computing and Optimization Techniques
