String Matching with Inversions and Translocations in Linear Average Time (Most of the Time)
Szymon Grabowski, Simone Faro, Emanuele Giaquinta

TL;DR
This paper introduces an efficient algorithm for approximate string matching that accounts for translocations and inversions, achieving linear average time complexity under certain conditions, with practical effectiveness demonstrated through experiments.
Contribution
The paper presents a novel filtering-based algorithm for approximate pattern matching allowing translocations and inversions, with proven worst-case and average-case time complexities.
Findings
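Worst-case time complexity is O(nm max(α, β)), where n is the text length, m is the pattern length, and α and β are the maximum factor lengths allowed in a translocation and an inversion, respectively.
Average-case time complexity is O(n) for equiprobable, independent characters, provided the alphabet is large enough relative to the pattern length.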
Experimental results show high practical efficiency.
Abstract
We present an efficient algorithm for finding all approximate occurrences of a given pattern p of length m in a text t of length n allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an O(nm max(α, β))-time complexity in the worst case and O(max(α, β))-space complexity, where α and β are respectively the maximum length of the factors involved in any translocation and inversion. Moreover, we show that under the assumptions of equiprobability and independence of characters our algorithm has an O(n) average time complexity, whenever σ = Ω(log m / log log^(1−ε) m), where ε > 0 and σ is the dimension of the alphabet. Experiments show that the new proposed algorithm achieves very good results in…
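To make the matching notion in the abstract concrete, here is a minimal brute-force sketch. It assumes the usual reading of the definition: the text window is obtained from the pattern by non-overlapping operations, each either swapping two adjacent factors of equal length at most α or reversing a factor of length at most β. The function names and the prefix-by-prefix dynamic programming are illustrative assumptions; this is not the paper's filtering algorithm and makes no attempt to reach its complexity bounds.

```python
def matches_with_inv_trans(pattern, window, alpha, beta):
    """Check whether window equals pattern up to non-overlapping
    translocations (adjacent equal-length factors, length <= alpha)
    and inversions (factor length <= beta). Hypothetical helper."""
    m = len(pattern)
    if len(window) != m:
        return False

    # ok[i] is True when pattern[:i] can be transformed into window[:i].
    ok = [False] * (m + 1)
    ok[0] = True
    for i in range(1, m + 1):
        # Case 1: last character is left unchanged.
        if ok[i - 1] and pattern[i - 1] == window[i - 1]:
            ok[i] = True
            continue
        # Case 2: the last k characters are an inverted factor.
        for k in range(2, min(beta, i) + 1):
            if ok[i - k] and window[i - k:i] == pattern[i - k:i][::-1]:
                ok[i] = True
                break
        if ok[i]:
            continue
        # Case 3: the last 2k characters are two swapped factors.
        for k in range(1, min(alpha, i // 2) + 1):
            if (ok[i - 2 * k]
                    and window[i - 2 * k:i - k] == pattern[i - k:i]
                    and window[i - k:i] == pattern[i - 2 * k:i - k]):
                ok[i] = True
                break
    return ok[m]


def find_occurrences(pattern, text, alpha, beta):
    """Return every start position whose window matches the pattern."""
    m = len(pattern)
    return [s for s in range(len(text) - m + 1)
            if matches_with_inv_trans(pattern, text[s:s + m], alpha, beta)]
```

For instance, with the pattern "abcd", the window "cdab" matches when α ≥ 2 (the factors "ab" and "cd" are swapped) but not when α = 1 and β = 2, and "dcba" matches only when β ≥ 4 (the whole pattern is inverted).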
