Open-MPI over MOSIX: paralleled computing in a clustered world
Adam Lev-Libfeld, Alex Margolin, Amnon Barak

TL;DR
This paper presents an integrated approach combining MOSIX process migration with Open-MPI, introducing a direct communication module to reduce overhead and improve runtime in clustered parallel computing environments.
Contribution
It introduces a novel module for direct communication between migrated Open-MPI processes, reducing communication overhead and enhancing performance in cluster computing.
Findings
Reduced run-time through improved resource allocation
Effective direct communication between migrated processes
Mitigated TCP/IP communication latency issues
Abstract
The recent increase in interest in cloud computing emphasizes the need for an adequate solution to the load-balancing problem in parallel computing: efficiently running several jobs concurrently on a cluster of shared computers (nodes). One approach to this problem is preemptive process migration, the transfer of running processes between nodes. A possible drawback of this approach is increased communication overhead between heavily communicating processes. This project addresses that drawback by incorporating the process migration capability of MOSIX into Open-MPI and by reducing the resulting communication overhead. Specifically, we developed a module for direct communication (DiCOM) between migrated Open-MPI processes, to overcome the increased TCP/IP communication latency between such processes. The outcome is reduced run-time through improved resource allocation.
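The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not taken from the paper: the function `run_time`, the constants, and the assumption that a migrated process without direct communication pays one extra network hop per message are all hypothetical, chosen only to show why migration can shorten run-time and why cutting per-message latency helps further.

```python
def run_time(work_s, cpu_share, messages, latency_s):
    """Toy estimate: compute time stretched by CPU sharing, plus messaging cost.

    work_s     -- CPU seconds the process needs (hypothetical)
    cpu_share  -- fraction of a CPU the process receives (0 < cpu_share <= 1)
    messages   -- number of messages the process exchanges (hypothetical)
    latency_s  -- cost per message in seconds (hypothetical)
    """
    return work_s / cpu_share + messages * latency_s


# All numbers below are illustrative assumptions, not measurements.
WORK_S = 100.0     # CPU seconds of work per process
MESSAGES = 10_000  # messages exchanged during the run
HOP_S = 0.001      # latency of a single network hop, in seconds

scenarios = {
    # Process stays on an overloaded node and gets half a CPU.
    "no migration": run_time(WORK_S, 0.5, MESSAGES, HOP_S),
    # Process moves to an idle node; assume each message costs an extra hop.
    "migrated, extra hop": run_time(WORK_S, 1.0, MESSAGES, 2 * HOP_S),
    # Process moves to an idle node and peers talk to it directly.
    "migrated, direct": run_time(WORK_S, 1.0, MESSAGES, HOP_S),
}

for name, seconds in scenarios.items():
    print(f"{name}: {seconds:.1f} s")
```

Under these assumed numbers the migrated process finishes sooner than the one left on the overloaded node, and the direct-communication variant is the fastest of the three. With a much larger message count or hop latency, the extra-hop scenario could lose its advantage, which is the overhead the abstract refers to.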
Taxonomy
Topics: Distributed and Parallel Computing Systems · Interconnection Networks and Systems · Cloud Computing and Resource Management
