Approaches to the Parallelization of Merge Sort in Python
Alexandra Yang

TL;DR
This paper implements several parallel merge sort algorithms in Python using the multiprocessing and mpi4py packages and benchmarks them on an academic supercomputer. Hybrid multiprocessing merge sort outperforms several of the other algorithms, reaching a 1.5x speedup over the built-in sorted() and a 34x speedup over sequential merge sort.
Contribution
It introduces and compares multiple parallel merge sort implementations in Python, highlighting the effectiveness of hybrid multiprocessing methods for large-scale data sorting.
Findings
Hybrid multiprocessing merge sort achieves a 1.5x speedup over Python's built-in sorted()
Hybrid multiprocessing merge sort achieves a 34x speedup over sequential merge sort
Provides insights into parallel merge sort implementation in Python on shared and distributed systems
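For context on the 34x figure, the baseline it is measured against is a sequential merge sort. The following is a minimal textbook sketch of such a baseline, not the paper's own code; the function names merge and merge_sort are chosen here for illustration.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller head element; <= keeps the sort stable.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Sequential top-down merge sort; returns a new sorted list."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    # Divide: sort each half recursively. Conquer: merge the sorted halves.
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))
```

The two recursive calls are independent of each other, which is what makes the algorithm a natural candidate for divide-and-conquer parallelization.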
Abstract
The theory of divide-and-conquer parallelization has been well-studied in the past, providing a solid basis upon which to explore different approaches to the parallelization of merge sort in Python. Python's simplicity and extensive selection of libraries make it the most popular scientific programming language, so it is a fitting language in which to implement and analyze these algorithms. In this paper, we use the Python packages multiprocessing and mpi4py to implement several different parallel merge sort algorithms. Experiments are conducted on an academic supercomputer, on which benchmarks are performed using Cloudmesh. We find that hybrid multiprocessing merge sort outperforms several other algorithms, achieving a 1.5x speedup compared to the built-in Python sorted() and a 34x speedup compared to sequential merge sort. Our results provide insight into different approaches to…
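One plausible reading of "hybrid multiprocessing merge sort" is sketched below: the input is split into chunks, each chunk is sorted in a separate process with the built-in sorted(), and the sorted chunks are then merged. This is an assumption about the general shape of the technique, not the paper's implementation; the names split_into_chunks and hybrid_mp_merge_sort, the default of 4 processes, and the use of heapq.merge for the final k-way merge are all choices made here for illustration.

```python
import heapq
import random
from multiprocessing import Pool


def split_into_chunks(values, n_chunks):
    """Split values into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for k in range(n_chunks):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if k < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def hybrid_mp_merge_sort(values, processes=4):
    """Sort chunks in parallel with sorted(), then k-way merge the results.

    Assumed design: one chunk per worker process (hypothetical choice).
    """
    if len(values) < 2:
        return list(values)
    chunks = split_into_chunks(values, processes)
    with Pool(processes=processes) as pool:
        # Each worker sorts one chunk with the built-in sorted().
        sorted_chunks = pool.map(sorted, chunks)
    # heapq.merge lazily merges any number of already-sorted inputs.
    return list(heapq.merge(*sorted_chunks))


if __name__ == "__main__":
    data = [random.randint(0, 10**6) for _ in range(100_000)]
    assert hybrid_mp_merge_sort(data) == sorted(data)
```

The `if __name__ == "__main__":` guard matters on platforms where multiprocessing starts workers by re-importing the main module. In a sketch like this, the cost of pickling chunks to and from the workers can outweigh the parallel gain for small inputs, so any speedup would need to be measured rather than assumed.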
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Algorithms and Data Compression
