Online algorithms for finding distinct substrings with length and multiple prefix and suffix conditions
Laurentius Leonard, Shunsuke Inenaga, Hideo Bannai, Takuya Mieno

TL;DR
This paper presents efficient online algorithms for finding distinct substrings within a string that meet specific prefix and suffix conditions, with applications in network traffic classification.
Contribution
The authors introduce new algorithms with optimal preprocessing and query times for counting and reporting substrings satisfying prefix and suffix constraints in an online setting.
Findings
Preprocessing time is $O((\Vert P\Vert + \Vert S\Vert)\log\sigma)$, where $\Vert\cdot\Vert$ denotes the total length of the strings in a sequence and $\sigma$ is the alphabet size.
Query time for counting is $O(|T_i|\log\sigma)$.
Algorithms are applicable to network traffic classification.
Abstract
Let two static sequences of strings $P$ and $S$, representing prefix and suffix conditions respectively, be given as input for preprocessing. For the query, let two positive integers $k_1$ and $k_2$ be given, as well as a string $T$ given in an online manner, such that $T_i$ represents the length-$i$ prefix of $T$ for $1 \leq i \leq |T|$. In this paper we are interested in computing the set $\mathit{ans}_i$ of distinct substrings $w$ of $T_i$ such that $k_1 \leq |w| \leq k_2$, and $w$ contains some $p \in P$ as a prefix and some $s \in S$ as a suffix. More specifically, the counting problem is to output $|\mathit{ans}_i|$, whereas the reporting problem is to output all elements of $\mathit{ans}_i$, for each iteration $i$. Let $\sigma$ denote the alphabet size, and for a sequence of strings $A$, $\Vert A\Vert = \sum_{u \in A} |u|$. Then, we show that after $O((\Vert P\Vert +\Vert…
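To make the problem statement concrete, here is a minimal brute-force sketch of the counting problem. It is not the paper's algorithm and does not achieve the stated time bounds; it only illustrates what is being counted at each iteration. The function name `count_online` and the parameter names `k1` and `k2` (the lower and upper length bounds) are labels chosen for this illustration.

```python
from typing import Iterator, Sequence


def count_online(P: Sequence[str], S: Sequence[str],
                 k1: int, k2: int, T: str) -> Iterator[int]:
    """Naive reference: yield the number of distinct qualifying
    substrings of each prefix T[:i], for i = 1 .. len(T).

    A substring w qualifies if k1 <= len(w) <= k2, some string in P
    is a prefix of w, and some string in S is a suffix of w.
    """
    seen = set()  # distinct qualifying substrings found so far
    for i in range(1, len(T) + 1):
        # The only substrings new at step i are those ending at position i,
        # so it is enough to test the suffixes of T[:i] within the bounds.
        for length in range(k1, min(k2, i) + 1):
            w = T[i - length:i]
            if (any(w.startswith(p) for p in P)
                    and any(w.endswith(s) for s in S)):
                seen.add(w)
        yield len(seen)


# Example: prefix condition "a", suffix condition "b", lengths 2..4.
print(list(count_online(["a"], ["b"], 2, 4, "abab")))  # [0, 1, 1, 2]
```

In the example, `ab` first qualifies at the second iteration and `abab` at the fourth, while `aba` and `bab` fail the suffix and prefix conditions respectively. The sketch spends time proportional to the length window and the pattern lengths at every step, which is the cost the paper's preprocessing is meant to avoid.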
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · Semigroups and Automata Theory
