Online Computation of String Net Frequency
Peaker Guo, Seeun William Umboh, Anthony Wirth, Justin Zobel

TL;DR
This paper introduces the first online algorithms for computing string net frequency in texts, achieving optimal time complexity using suffix trees, which is valuable for real-time text analysis applications.
Contribution
It presents the first online algorithms for the SINGLE-NF and ALL-NF problems, running in optimal time for a constant-size alphabet and built on new, simpler offline algorithms.
Findings
SINGLE-NF computed in O(m) time, where m is the length of the query string
ALL-NF computed in O(n) time, where n is the length of the text
Algorithms are optimal and applicable in real-time scenarios
Abstract
The net frequency (NF) of a string, of length m, in a text, of length n, is the number of occurrences of the string in the text with unique left and right extensions. Recently, Guo et al. [CPM 2024] showed that NF is combinatorially interesting and how two key questions can be computed efficiently in the offline setting. First, SINGLE-NF: reporting the NF of a query string in an input text. Second, ALL-NF: reporting an occurrence and the NF of each string of positive NF in an input text. For many applications, however, facilitating these computations in an online manner is highly desirable. We are the first to solve the above two problems in the online setting, and we do so in optimal time, assuming, as is common, a constant-size alphabet: SINGLE-NF in O(m) time and ALL-NF in O(n) time. Our results are achieved by first designing new and simpler offline algorithms using suffix…
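To make the definition concrete, here is a minimal brute-force sketch in Python of the two problems. It is not the paper's suffix-tree method and comes nowhere near the O(m) and O(n) bounds. It rests on three assumptions that the abstract does not spell out: an extension is "unique" when the extended string occurs exactly once in the text, the query itself must occur at least twice to have positive NF, and the text is padded with two distinct sentinel characters so that occurrences at the boundaries still have extensions. The function names are hypothetical.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count possibly overlapping occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def net_frequency(text: str, query: str) -> int:
    """Naive SINGLE-NF: occurrences of query whose one-character left and
    right extensions each occur exactly once (assumed reading of 'unique')."""
    # Assumption: sentinels "\x00" and "\x01" do not appear in text.
    padded = "\x00" + text + "\x01"
    # Assumption: a string that is not repeated has net frequency 0.
    if not query or count_occurrences(padded, query) < 2:
        return 0
    nf = 0
    start = padded.find(query)
    while start != -1:
        end = start + len(query)
        left_ext = padded[start - 1:end]    # one character added on the left
        right_ext = padded[start:end + 1]   # one character added on the right
        if (count_occurrences(padded, left_ext) == 1
                and count_occurrences(padded, right_ext) == 1):
            nf += 1
        start = padded.find(query, start + 1)
    return nf


def all_net_frequencies(text: str) -> dict[str, tuple[int, int]]:
    """Naive ALL-NF: map each string of positive NF to
    (start index of its first occurrence, net frequency)."""
    result: dict[str, tuple[int, int]] = {}
    seen: set[str] = set()
    for i in range(len(text)):
        for j in range(i + 1, len(text) + 1):
            s = text[i:j]
            if s in seen:
                continue
            seen.add(s)
            nf = net_frequency(text, s)
            if nf > 0:
                result[s] = (i, nf)
    return result
```

Under these assumptions, in the text "abcabd" the string "ab" has NF 2, because both occurrences have unique extensions on each side. The string "a" has NF 0, because its right extension "ab" occurs twice.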
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · Caching and Content Delivery
