Efficient Sequential and Parallel Algorithms for Planted Motif Search
Marius Nicolae, Sanguthevar Rajasekaran

TL;DR
This paper introduces PMS8, an exact parallel algorithm for Planted Motif Search that solves challenging (l,d) instances previously out of reach and remains efficient for larger values of l and d.
Contribution
The paper presents PMS8, an exact parallel PMS algorithm and the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). It also introduces necessary and sufficient conditions for three l-mers to have a common d-neighbor.
Findings
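PMS8 is the first algorithm to solve the challenging (25,10) and (26,11) instances.
PMS8 is also efficient on instances with larger l and d, such as (50,21).
The paper introduces necessary and sufficient conditions for three l-mers to have a common d-neighbor.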
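To make the notion of a common d-neighbor concrete, here is a minimal Python sketch. It is not the paper's condition, which this summary does not state: it simply tries every l-mer over an assumed DNA alphabet, so it is only usable for very small l. The function names `has_common_d_neighbor` and `pairwise_within_2d` are hypothetical. The second function checks a weaker condition that follows from the triangle inequality (every pair of l-mers must be within Hamming distance 2d), which is necessary but not sufficient.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA alphabet


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def has_common_d_neighbor(x, y, z, d):
    """Brute force: test every l-mer over ALPHABET (exponential in l)."""
    l = len(x)
    return any(
        all(hamming(m, s) <= d for s in (x, y, z))
        for m in map("".join, product(ALPHABET, repeat=l))
    )


def pairwise_within_2d(x, y, z, d):
    """Necessary (not sufficient) condition from the triangle inequality."""
    return all(hamming(a, b) <= 2 * d for a, b in ((x, y), (x, z), (y, z)))
```

For example, with d = 1 the 2-mers AA, CC and GG are pairwise within distance 2d = 2, yet no 2-mer is within distance 1 of all three, because each position of a candidate can agree with at most one of them.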
Abstract
Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. This paper presents an exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger l and d such as (50,21). This paper also introduces necessary and sufficient conditions for 3 l-mers to have a common d-neighbor.
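The problem definition in the abstract can be illustrated with a naive exhaustive search. The sketch below is not PMS8; it enumerates every candidate l-mer over an assumed DNA alphabet and keeps those that occur in each input sequence with at most d mismatches. Its running time grows as 4^l, which is why instances such as (25,10) need far more refined algorithms. The function names are hypothetical.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA alphabet


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some l-mer of seq is within Hamming distance d of motif."""
    l = len(motif)
    return any(
        hamming(motif, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def brute_force_pms(seqs, l, d):
    """Return all l-mers that occur in every sequence with <= d mismatches."""
    candidates = map("".join, product(ALPHABET, repeat=l))
    return sorted(
        m for m in candidates
        if all(occurs_within(m, s, d) for s in seqs)
    )
```

With the toy input `["AACGT", "TACGA", "GACGC"]`, l = 3 and d = 0, the only 3-mer present in all three sequences is `ACG`.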
Taxonomy
Topics: Algorithms and Data Compression · Genomics and Phylogenetic Studies · Genomics and Chromatin Dynamics
