Fast low-level pattern matching algorithm

Janja Paliska Soldo, Ana Sovic Krzic, and Damir Sersic

arXiv:1611.06115·cs.CV·November 21, 2016

Open Access

TL;DR

This paper introduces a fast, memory-efficient pattern matching algorithm for DNA sequences that leverages modular arithmetic, multithreading, and assembly-level optimization to handle large patterns effectively.

Contribution

The paper presents a novel pattern matching method that overcomes previous size limitations by using modular arithmetic and low-level implementation for enhanced speed and memory efficiency.

Findings

01

Significant time and memory savings compared to the reference algorithm

02

Ability to handle larger patterns than previous methods

03

Efficient use of multithreading and assembly-level optimization

Abstract

This paper addresses pattern matching in DNA sequences. It was inspired by a previously reported method that encodes both the pattern and the sequence using prime numbers. Although fast, that method is limited to rather small pattern lengths because of limited computing precision. Our approach handles large patterns by using modular arithmetic in the implementation. To obtain results quickly, the code was adapted for multithreaded and parallel execution. The method is reduced to assembly-language-level instructions, and the final result shows significant time and memory savings compared to the reference algorithm.
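To illustrate why modular arithmetic removes the pattern-length limit, here is a minimal Python sketch. The abstract does not spell out the authors' prime-number encoding, so this sketch substitutes a standard Rabin-Karp rolling hash; it is not the paper's algorithm. The names `BASE_CODE`, `RADIX`, `MODULUS`, and `find_pattern` are assumptions made for this example only. The point it shows is that every intermediate value stays below a fixed modulus, so the numeric code of a pattern never outgrows a machine word, however long the pattern is.

```python
# Illustrative sketch only: Rabin-Karp matching with modular arithmetic.
# This is NOT the paper's encoding; all constants below are assumed values.

BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed nucleotide codes
RADIX = 4                                     # one digit per nucleotide
MODULUS = (1 << 61) - 1                       # a Mersenne prime (assumed choice)


def find_pattern(sequence: str, pattern: str) -> list:
    """Return every start index where pattern occurs in sequence."""
    m, n = len(pattern), len(sequence)
    if m == 0 or m > n:
        return []

    # Weight of the leading symbol in a window: RADIX^(m-1) mod MODULUS.
    lead = pow(RADIX, m - 1, MODULUS)

    # Hash the pattern and the first window, reducing at every step so the
    # values never exceed MODULUS regardless of the pattern length.
    p_hash = 0
    w_hash = 0
    for i in range(m):
        p_hash = (p_hash * RADIX + BASE_CODE[pattern[i]]) % MODULUS
        w_hash = (w_hash * RADIX + BASE_CODE[sequence[i]]) % MODULUS

    matches = []
    for start in range(n - m + 1):
        # Equal hashes may be a collision, so confirm with a direct comparison.
        if w_hash == p_hash and sequence[start:start + m] == pattern:
            matches.append(start)
        if start + m < n:
            # Slide the window: drop the leading symbol, append the next one.
            w_hash = (w_hash - BASE_CODE[sequence[start]] * lead) % MODULUS
            w_hash = (w_hash * RADIX + BASE_CODE[sequence[start + m]]) % MODULUS
    return matches
```

A call such as `find_pattern("ACGTACGTAC", "ACGT")` returns the start positions of each occurrence. Because the hash is reduced modulo a fixed prime, a 400-nucleotide pattern costs the same per-step arithmetic as a 4-nucleotide one. An encoding that multiplies primes without reduction would need ever wider integers as the pattern grows.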

Taxonomy

Topics: Algorithms and Data Compression · DNA and Biological Computing · Network Packet Processing and Optimization