Porting and optimizing UniFrac for GPUs
Igor Sfiligoi, Daniel McDonald, Rob Knight

TL;DR
This paper details the porting and optimization of the UniFrac microbiome comparison metric to GPUs, significantly reducing computation time for large datasets through code optimization and parallelization.
Contribution
The paper presents a GPU implementation of Striped UniFrac, achieving substantial speedups over CPU versions with minimal precision loss.
Findings
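Reduced UniFrac run time on the Earth Microbiome Project dataset from 13 hours on an Intel Xeon E5-2680 v4 CPU to 12 minutes on an NVIDIA Tesla V100 GPU
Processed a 113k-sample dataset in under 2 hours on the V100, compared with over one month on the CPU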
Provided a BSD-licensed GPU-accelerated UniFrac implementation
Abstract
UniFrac is a commonly used metric in microbiome research for comparing microbiome profiles to one another ("beta diversity"). The recently implemented Striped UniFrac added the capability to split the problem into many independent subproblems and exhibits near-linear scaling. In this paper we describe the steps undertaken in porting and optimizing Striped UniFrac to GPUs. We reduced the run time of computing UniFrac on the published Earth Microbiome Project dataset from 13 hours on an Intel Xeon E5-2680 v4 CPU to 12 minutes on an NVIDIA Tesla V100 GPU, and to about one hour on a laptop with an NVIDIA GTX 1050 (with minor loss in precision). Computing UniFrac on a larger dataset containing 113k samples reduced the run time from over one month on the CPU to less than 2 hours on the V100 and 9 hours on an NVIDIA RTX 2080TI GPU (with minor loss in precision). This was achieved by using OpenACC for…
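To illustrate what "many independent subproblems" can look like, here is a minimal Python sketch, not the authors' code. It assumes that each stripe is one wrapped diagonal of the sample-by-sample distance matrix, that samples are given as a boolean branch-presence matrix, and that the metric is unweighted UniFrac (unshared branch length divided by total observed branch length). The names `stripe_pairs`, `compute_stripe`, and `striped_unifrac` are hypothetical.

```python
import numpy as np


def stripe_pairs(n, s):
    """Sample pairs in stripe s: the (s + 1)-th wrapped diagonal (assumed layout)."""
    return [(k, (k + s + 1) % n) for k in range(n)]


def compute_stripe(presence, lengths, s):
    """Unweighted UniFrac for one stripe.

    presence: bool array of shape (branches, samples)
    lengths:  float array of shape (branches,)
    Uses only the inputs and the stripe index, so stripes can run independently.
    """
    n = presence.shape[1]
    out = np.zeros(n)
    for k, j in stripe_pairs(n, s):
        u, v = presence[:, k], presence[:, j]
        total = np.sum(lengths * (u | v))   # branch length seen in either sample
        unique = np.sum(lengths * (u ^ v))  # branch length seen in exactly one
        out[k] = unique / total if total > 0 else 0.0
    return out


def striped_unifrac(presence, lengths):
    """Compute every stripe, then merge them into a symmetric distance matrix."""
    n = presence.shape[1]
    dm = np.zeros((n, n))
    for s in range(n // 2):  # n // 2 stripes cover every unordered pair
        vals = compute_stripe(presence, lengths, s)
        for k, j in stripe_pairs(n, s):
            dm[k, j] = dm[j, k] = vals[k]
    return dm
```

Because `compute_stripe` reads only its inputs and writes only its own output array, the loop over `s` in this sketch could be distributed across processes or devices without coordination.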
Taxonomy
Topics: Microbial Community Ecology and Physiology · Gut microbiota and health · Scientific Computing and Data Management
