Power Redistribution for Optimizing Performance in MPI Clusters
Ramy Medhat, Borzoo Bonakdarpour, Sebastian Fischmeister

TL;DR
This paper proposes a power redistribution method for MPI clusters that optimizes performance under power constraints by dynamically reallocating power from idle nodes to lagging nodes, significantly reducing execution time.
Contribution
It introduces an ILP-based optimal power distribution algorithm and an online heuristic for dynamic power redistribution in MPI applications.
Findings
The online heuristic achieves up to a 2.25x speedup.
Power redistribution reduces total execution time.
The method stays within the power bound while optimizing performance.
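To make the redistribution idea concrete, here is a minimal greedy sketch in Python. It is not the paper's ILP formulation or its online heuristic, whose details are not given on this page. The `Node` fields, the equal-share baseline, and the "most remaining work first" ordering are all assumptions made for illustration. The sketch only shows the invariant described above: power freed by idling a blocked node goes to lagging nodes, and the total never exceeds the cluster bound.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # All fields are hypothetical; the source does not define a node model.
    name: str
    blocked: bool      # True if the node is waiting at a synchronization point
    remaining: float   # assumed estimate of remaining work (arbitrary units)
    p_idle: float      # watts drawn while idled
    p_max: float       # maximum watts the node can usefully draw


def redistribute(nodes, power_bound):
    """Return {node name: power cap in watts}; the caps sum to at most power_bound."""
    base = power_bound / len(nodes)  # assumed baseline: equal share per node
    caps = {}
    pool = 0.0  # watts freed relative to the equal-share baseline

    for n in nodes:
        # Blocked nodes drop to idle power; running nodes keep their share,
        # limited by what they can actually draw.
        cap = min(n.p_idle, base) if n.blocked else min(n.p_max, base)
        caps[n.name] = cap
        pool += base - cap

    # Hand the freed power to running nodes, furthest behind first.
    lagging = sorted(
        (n for n in nodes if not n.blocked),
        key=lambda n: n.remaining,
        reverse=True,
    )
    for n in lagging:
        extra = min(pool, n.p_max - caps[n.name])
        caps[n.name] += extra
        pool -= extra

    return caps


if __name__ == "__main__":
    cluster = [
        Node("A", blocked=True, remaining=0.0, p_idle=20.0, p_max=100.0),
        Node("B", blocked=False, remaining=10.0, p_idle=20.0, p_max=100.0),
        Node("C", blocked=False, remaining=5.0, p_idle=20.0, p_max=100.0),
    ]
    print(redistribute(cluster, power_bound=180.0))
```

In this example node A is blocked, so the 40 W it gives up relative to its equal share go to node B, which has the most remaining work. A real system would also need to re-run the allocation whenever a node blocks or unblocks, which this sketch leaves out.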
Abstract
Power efficiency has recently become a major concern in the high-performance computing domain. HPC centers are provisioned with a power bound, which impacts execution time. Naturally, a tradeoff arises between power efficiency and computational efficiency. This paper tackles the problem of performance optimization for MPI applications, where a power bound is assumed. The paper exposes a subset of HPC applications that leverage cluster parallelism using MPI, where nodes encounter multiple synchronization points and exhibit inter-node dependency. We abstract this structure into a dependency graph, and leverage the asymmetry in execution time of parallel jobs on different nodes by redistributing power gained from idling a blocked node to nodes that are lagging in their jobs. We introduce a solution based on integer linear programming (ILP) for an optimal power distribution algorithm that…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Interconnection Networks and Systems
