Fast low-level pattern matching algorithm
Janja Paliska Soldo, Ana Sovic Krzic, and Damir Sersic

TL;DR
This paper introduces a fast, memory-efficient pattern matching algorithm for DNA sequences that leverages modular arithmetic, multithreading, and assembly-level optimization to handle large patterns effectively.
Contribution
The paper presents a novel pattern matching method that overcomes previous size limitations by using modular arithmetic and low-level implementation for enhanced speed and memory efficiency.
Findings
Significant reduction in running time and memory use compared with the reference algorithm
Ability to handle larger patterns than previous methods
Efficient use of multithreading and assembly-level optimization
Abstract
This paper focuses on pattern matching in DNA sequences. It was inspired by a previously reported method that encodes both the pattern and the sequence using prime numbers. Although fast, that method is limited to rather small pattern lengths because of a computing precision problem. Our approach handles large patterns successfully because the implementation uses modular arithmetic. To obtain results quickly, the code was adapted for multithreaded and parallel execution. The method is reduced to assembly-language-level instructions, and the final result shows significant time and memory savings compared to the reference algorithm.
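The abstract does not spell out the encoding, so the following is only an illustrative sketch of why modular arithmetic removes a pattern-length limit. It is not the authors' algorithm: it swaps in a standard Rabin-Karp style rolling hash. The prime codes assigned to the four bases, the radix, the modulus, and the function name `find_pattern` are all assumptions made for this example. Because every intermediate value is reduced modulo a fixed number, the encoded pattern never outgrows a machine word, however long the pattern is.

```python
# Illustrative sketch only: a Rabin-Karp style rolling hash, NOT the paper's method.
# All constants and names below are hypothetical choices for this example.
BASE_CODES = {"A": 2, "C": 3, "G": 5, "T": 7}  # assumed prime code per nucleotide
RADIX = 11                                     # assumed radix, larger than any code
MODULUS = (1 << 61) - 1                        # assumed modulus (a Mersenne prime)


def find_pattern(sequence, pattern):
    """Return the start indices of every occurrence of pattern in sequence."""
    m, n = len(pattern), len(sequence)
    if m == 0 or m > n:
        return []

    # Weight of the leading symbol in a window of length m, reduced mod MODULUS.
    high = pow(RADIX, m - 1, MODULUS)

    # Encode the pattern and the first window; values stay below MODULUS.
    p_hash = 0
    w_hash = 0
    for i in range(m):
        p_hash = (p_hash * RADIX + BASE_CODES[pattern[i]]) % MODULUS
        w_hash = (w_hash * RADIX + BASE_CODES[sequence[i]]) % MODULUS

    matches = []
    for start in range(n - m + 1):
        # Equal hashes can collide, so confirm with a direct comparison.
        if w_hash == p_hash and sequence[start:start + m] == pattern:
            matches.append(start)
        if start + m < n:
            # Slide the window: drop the leading symbol, append the next one.
            w_hash = (w_hash - BASE_CODES[sequence[start]] * high) % MODULUS
            w_hash = (w_hash * RADIX + BASE_CODES[sequence[start + m]]) % MODULUS
    return matches
```

For instance, `find_pattern("ACGTACGTAC", "ACGT")` returns `[0, 4]`. The same call works unchanged for patterns hundreds of bases long, since no intermediate value exceeds the modulus. The sliding-window loop is also the part that could be split across threads by giving each thread its own overlapping slice of the sequence, which is one plausible reading of the multithreading mentioned in the abstract.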
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Network Packet Processing and Optimization
