GPU Scheduler for De Novo Genome Assembly with Multiple MPI Processes
Minhao Li, Siyu Wang, Guanghao Wei

TL;DR
This paper introduces three GPU schedulers for the ELBA algorithm that let multiple MPI processes share GPUs during de novo genome assembly; the best scheduler achieves a 7-8x speed-up over baseline ELBA.
Contribution
It proposes novel GPU schedulers that allow multiple MPI processes to share GPUs effectively, enhancing ELBA's scalability and performance in genome assembly tasks.
Findings
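The one-to-one scheduler, the best-performing of the three, achieved a 7-8x speed-up over baseline ELBA using 25 MPI processes.
All three schedulers significantly outperformed baseline ELBA in both strong and weak scaling experiments.
There is a trade-off between parallelism and GPU scheduler overhead.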
Abstract
Genome assembly is one of the most important tasks in computational biology. ELBA is the state-of-the-art distributed-memory parallel algorithm for the overlap detection and layout simplification steps of genome assembly, but it has a performance bottleneck in pairwise alignment. In this work, we propose three GPU schedulers for ELBA that accommodate multiple MPI processes and multiple GPUs. The GPU schedulers enable multiple MPI processes to perform computation on GPUs in a round-robin fashion. Both strong and weak scaling experiments show that all three schedulers significantly improve on the baseline's performance, although there is a trade-off between parallelism and GPU scheduler overhead. In the best-performing implementation, the one-to-one scheduler achieves a 7-8x speed-up using 25 MPI processes compared with the baseline vanilla ELBA…
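The round-robin idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names `assign_gpu` and `gpu_queues`, the modulo mapping, and the choice of 4 GPUs are all assumptions made for illustration. Only the figure of 25 MPI processes comes from the abstract.

```python
from __future__ import annotations

from collections import defaultdict


def assign_gpu(rank: int, num_gpus: int) -> int:
    """Map an MPI rank to a GPU index in round-robin order (hypothetical helper)."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return rank % num_gpus


def gpu_queues(num_ranks: int, num_gpus: int) -> dict[int, list[int]]:
    """Group ranks by the GPU they share, in the order they would take turns."""
    queues: dict[int, list[int]] = defaultdict(list)
    for rank in range(num_ranks):
        queues[assign_gpu(rank, num_gpus)].append(rank)
    return dict(queues)


if __name__ == "__main__":
    # 25 ranks as in the abstract; 4 GPUs is an assumed node configuration.
    for gpu, ranks in gpu_queues(25, 4).items():
        print(f"GPU {gpu}: ranks {ranks}")
```

With more ranks than GPUs, several ranks land on the same device and must take turns. That sharing is one plausible source of the scheduler overhead the abstract weighs against added parallelism.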
Taxonomy
Topics: Genomics and Phylogenetic Studies · Genomics and Chromatin Dynamics · Gene expression and cancer classification
