CUDAMPF++: A Proactive Resource Exhaustion Scheme for Accelerating Homologous Sequence Search on CUDA-enabled GPU

Hanyu Jiang; Narayan Ganesan; Yu-Dong Yao

arXiv:1707.09683 · cs.CE · June 7, 2018

TL;DR

CUDAMPF++ is a GPU-based framework that significantly accelerates homologous sequence search in bioinformatics by optimizing parallelism and resource utilization, outperforming existing CPU and GPU methods.

Contribution

The paper introduces CUDAMPF++, a five-tiered GPU framework that enhances HMMER3's sequence alignment pipeline with architecture-aware design and cache sacrifice techniques for improved performance.

Findings

01 Peak performance of 283.9 GCUPS on GPU

02 Speedups up to 168.3x over CPU implementation

03 Consistent performance across various datasets
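The findings above are stated in GCUPS (giga cell updates per second) and as a speedup ratio. As a minimal sketch of how such figures are conventionally derived, assuming the usual definition of one cell update per (model position, database residue) pair: the function names `gcups` and `speedup` and all input numbers below are hypothetical illustrations, not measurements or code from the paper.

```python
def gcups(model_length: int, total_residues: int, seconds: float) -> float:
    """Giga cell updates per second for one profile-vs-database scan.

    Assumes the conventional definition: each (model position, residue)
    pair counts as one dynamic-programming cell update.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    cells = model_length * total_residues
    return cells / seconds / 1e9


def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Ratio of baseline runtime to accelerated runtime."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated_seconds must be positive")
    return baseline_seconds / accelerated_seconds


# Hypothetical inputs: a 400-position model scanned against a database
# of 200 million residues in 1 second.
example_gcups = gcups(400, 200_000_000, 1.0)   # 80.0
example_speedup = speedup(120.0, 2.0)          # 60.0
```

Because GCUPS normalizes by both model length and database size, it allows throughput to be compared across different queries and datasets, whereas a speedup figure is only meaningful relative to the specific baseline it was measured against.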

Abstract

Genomic sequence alignment is an important research topic in bioinformatics and continues to attract significant efforts. As genomic data grow exponentially, however, most alignment methods face challenges due to their huge computational costs. HMMER, a suite of bioinformatics tools, is widely used for the analysis of homologous protein and nucleotide sequences with high sensitivity, based on profile hidden Markov models (HMMs). Its latest version, HMMER3, introduces a heuristic pipeline to accelerate the alignment process, which is carried out on central processing units (CPUs) with the support of streaming SIMD extensions (SSE) instructions. Few acceleration results have since been reported based on HMMER3. In this paper, we propose a five-tiered parallel framework, CUDAMPF++, to accelerate the most computationally intensive stages of HMMER3's pipeline, multiple/single segment…
