On Longest Repeat Queries Using GPU

Yun Tian; Bojian Xu

arXiv:1501.06663 · cs.DC · January 28, 2015

TL;DR

This paper introduces a GPU-accelerated method for finding all longest repeats in strings, offering faster processing and lower memory usage than the optimal solution, with practical benefits in computational biology.

Contribution

A new parallelizable solution for longest repeat queries that finds all longest repeats of every string position and is conceptually simpler, faster in practice, and more memory-efficient than the existing optimal method.

Findings

01 Runs 2 to 3.5 times faster than the optimal solution when executed sequentially

02 Runs 6 to 14 times faster than the optimal solution when executed in parallel

03 Uses less memory in practice than the optimal solution

Abstract

Repeat finding in strings has important applications in subfields such as computational biology. The problem of finding the longest repeats covering particular string positions was recently proposed and solved by İleri et al., using optimal $O(n)$ total time and space, where $n$ is the string size. However, their solution can only find the leftmost longest repeat for each of the $n$ string positions. It is also not known how to parallelize their solution. In this paper, we propose a new solution for longest repeat finding that, although theoretically suboptimal in time, is conceptually simpler, runs faster, and uses less memory in practice than the optimal solution. Further, our solution can find all longest repeats of every string position while still running faster and using less memory. Moreover, our solution…
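To make the problem statement concrete, the following is a minimal brute-force sketch of a longest repeat query. It is not the paper's algorithm or the solution of İleri et al. and has none of their time or space guarantees. The function names `count_occurrences` and `longest_repeats_covering` are hypothetical, positions are assumed to be 0-based, and a repeat is assumed to be any substring that occurs at least twice, with overlapping occurrences allowed.

```python
def count_occurrences(s, sub):
    """Count occurrences of sub in s, allowing overlaps (an assumption)."""
    count = 0
    start = s.find(sub)
    while start != -1:
        count += 1
        start = s.find(sub, start + 1)
    return count


def longest_repeats_covering(s, k):
    """Return every (start, length) of the longest repeats covering position k.

    Brute force for illustration only: try lengths from longest to shortest
    and stop at the first length with at least one repeated window over k.
    """
    n = len(s)
    for length in range(n - 1, 0, -1):  # a repeat must be shorter than s
        lo = max(0, k - length + 1)     # leftmost window start that covers k
        hi = min(k, n - length)         # rightmost window start that covers k
        hits = [(i, length)
                for i in range(lo, hi + 1)
                if count_occurrences(s, s[i:i + length]) >= 2]
        if hits:
            return hits                 # sorted by start; hits[0] is leftmost
    return []                           # no repeat covers position k


result = longest_repeats_covering("banana", 3)
print(result)
```

For `"banana"` and position 3, two occurrences of `ana` (starting at 1 and at 3) both cover the position. A leftmost-only query would report just the first of these, whereas an all-repeats query reports both.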

Taxonomy

Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · DNA and Biological Computing