Sub-string/Pattern Matching in Sub-linear Time Using a Sparse Fourier Transform Approach
Nagaraj T. Janakiraman, Avinash Vem, Krishna R. Narayanan, Jean-Francois Chamberland

TL;DR
This paper introduces a sub-linear time algorithm for substring matching in large strings using sparse Fourier transforms, improving efficiency over previous methods, especially for approximate matches.
Contribution
The paper presents a novel sub-linear time approach for exact and approximate substring matching using sparse Fourier transforms, with improved computational complexity and sketching requirements.
Findings
Achieves sub-linear time complexity for substring matching
Handles approximate matches within Hamming distance K
Reduces Fourier transform coefficients needed for matching
Abstract
We consider the problem of querying a string (or a database) of length $N$ bits to determine all the locations where a substring (query) of length $M$ appears either exactly or within a Hamming distance of $K$ from the query. We assume that sketches of the original signal can be computed offline and stored. Using the sparse Fourier transform computation approach introduced by Pawar and Ramchandran, we show that all such matches can be determined with high probability in sub-linear time. Specifically, if the query length is $M = O(N^{\mu})$ and the number of matches is $L = O(N^{\lambda})$, we show that for $\lambda < 1-\mu$ all the matching positions can be determined with a probability that approaches 1 as $N \rightarrow \infty$ for $K \leq \frac{1}{6}M$. More importantly, our scheme has a worst-case computational complexity that is only $O\left(\max\{N^{1-\mu}\log^2 N,…
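To make the matching criterion concrete, the following is a minimal sketch of the dense FFT baseline, not the paper's sub-linear algorithm. It assumes a string of N bits, a query of M bits, and a Hamming tolerance K. Bits are mapped to ±1, so the cross-correlation at shift τ equals M minus twice the Hamming distance between the query and the window starting at τ; a shift is therefore a match exactly when the correlation is at least M − 2K. The function name `find_matches` and the use of NumPy are illustrative assumptions. This baseline costs O(N log N); the sparse Fourier transform approach described in the abstract exploits the fact that only a few shifts have large correlation to avoid computing the full transform.

```python
import numpy as np

def find_matches(x_bits, y_bits, K=0):
    """Return every shift tau where the query y_bits lies within Hamming
    distance K of the window x_bits[tau : tau + M].

    Hypothetical helper: a dense-FFT baseline, not the sparse-FFT scheme.
    """
    # Map bits to +/-1 so that equal bits multiply to +1, unequal to -1.
    x = 1 - 2 * np.asarray(x_bits, dtype=np.int64)
    y = 1 - 2 * np.asarray(y_bits, dtype=np.int64)
    N, M = len(x), len(y)

    # Zero-pad so the circular correlation has no wrap-around
    # for the shifts 0 .. N - M that we keep.
    size = N + M - 1
    X = np.fft.rfft(x, n=size)
    Y = np.fft.rfft(y, n=size)

    # r[tau] = sum_i x[tau + i] * y[i] = M - 2 * HammingDistance(tau)
    r = np.fft.irfft(X * np.conj(Y), n=size)
    r = np.rint(r[: N - M + 1]).astype(np.int64)

    # Hamming distance <= K  <=>  r >= M - 2K
    return np.flatnonzero(r >= M - 2 * K)


if __name__ == "__main__":
    x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
    y = [1, 1, 0, 1]
    print(find_matches(x, y, K=0))  # shifts with an exact match
    print(find_matches(x, y, K=1))  # shifts with at most one mismatched bit
```

Setting `K=0` recovers exact matching, since the correlation reaches its maximum value M only when every bit agrees.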
Taxonomy
Topics: Algorithms and Data Compression · Machine Learning and Algorithms · Network Packet Processing and Optimization
