Ultra-fast Multiple Genome Sequence Matching Using GPU
Gang Liao, Qi Sun, Longfei Ma, Sha Ding, Wen Xie

TL;DR
This paper demonstrates that GPU-accelerated suffix array-based algorithms significantly outperform CPU-based methods for fast multiple genome sequence matching, achieving over 99x speedup.
Contribution
It provides a comprehensive comparison of suffix tree and suffix array implementations on GPU, highlighting the superior efficiency of suffix array for genome matching.
Findings
The suffix array uses approximately 20% to 30% of the space required by the suffix tree.
The GPU suffix array implementation is more than 99 times faster than the serial CPU implementation.
Parallel suffix array matching is highly effective for bioinformatics.
Abstract
This paper presents a comparative evaluation of massively parallel implementations of the suffix tree and the suffix array for accelerating genome sequence matching, run on an Intel Core i7 3770K quad-core CPU and an NVIDIA GeForce GTX680 GPU. The suffix array requires only approximately 20% to 30% of the space of the suffix tree, and coalesced binary search and tile optimization make it clearly outperform the suffix tree on the GPU. The experimental results show that multiple genome sequence matching based on the suffix array achieves a speedup of more than 99 times over the serial CPU implementation. Massively parallel matching based on the suffix array is therefore an efficient approach for high-performance bioinformatics applications.
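To make the core idea concrete, here is a minimal serial sketch of pattern matching by binary search over a suffix array. It is not the authors' GPU implementation and does not reproduce their coalesced binary search or tile optimization. The function names (`build_suffix_array`, `find_occurrences`, `match_all`) and the naive sort-based construction are assumptions chosen for illustration.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order.

    Naive construction (sorts the suffixes directly); fine for a short
    illustration, far too slow for real genomes.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return every start position of pattern in text, in ascending order.

    All suffixes that begin with pattern form one contiguous block in the
    suffix array, so two binary searches locate the block's boundaries.
    """
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


def match_all(text, patterns):
    """Match several query patterns against one text with a shared suffix array."""
    sa = build_suffix_array(text)
    return {p: find_occurrences(text, sa, p) for p in patterns}


if __name__ == "__main__":
    genome = "GATTACAGATTACA"
    for query, hits in match_all(genome, ["TTA", "GATTACA", "CCC"]).items():
        print(query, hits)
```

Each query in `match_all` is independent of the others, which is what makes the lookup stage a natural candidate for parallel execution; how the queries and array accesses are actually mapped to GPU threads is specific to the paper and is not shown here.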
Taxonomy
Topics: Algorithms and Data Compression · Genomics and Phylogenetic Studies · Machine Learning in Bioinformatics
