Constructing LZ78 Tries and Position Heaps in Linear Time for Large Alphabets
Yuto Nakashima, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

TL;DR
This paper introduces a worst-case linear-time algorithm for constructing LZ78 tries and position heaps over large alphabets, leveraging nearest marked ancestor queries on suffix trees.
Contribution
It presents the first worst-case linear-time algorithm for LZ78 factorization of a string over an integer alphabet, and shows that the same technique constructs the position heap of a set of strings given as a trie in worst-case linear time.
Findings
Linear-time LZ78 factorization algorithm for large alphabets
Linear-time construction of the position heap of a set of strings given as a trie
Utilization of nearest marked ancestor queries on suffix trees
Abstract
We present the first worst-case linear-time algorithm to compute the Lempel-Ziv 78 factorization of a given string over an integer alphabet. Our algorithm is based on nearest marked ancestor queries on the suffix tree of the given string. We also show that the same technique can be used to construct the position heap of a set of strings in worst-case linear time, when the set of strings is given as a trie.
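For readers unfamiliar with the factorization itself, the following is a minimal sketch of LZ78 using the textbook approach of growing a trie of factors while scanning the string. It is not the paper's suffix-tree algorithm with nearest marked ancestor queries: it relies on hash-based child lookup, so its running time is linear only in expectation rather than in the worst case. The function name lz78_factorize and the list-of-dicts trie layout are illustrative assumptions, not taken from the paper.

```python
def lz78_factorize(text):
    """Return the LZ78 factors of `text` as a list of substrings.

    Illustrative sketch only: each factor is the longest previously seen
    factor (possibly empty) extended by one character. Children are stored
    in dicts, so lookups are expected O(1), not worst-case O(1).
    """
    trie = [{}]      # trie[v] maps a character to the id of a child of node v; node 0 is the root
    factors = []
    node = 0         # current position in the trie
    start = 0        # start index of the factor being read
    for i, ch in enumerate(text):
        child = trie[node].get(ch)
        if child is None:
            # No edge labelled ch: the current factor ends here and
            # becomes a new trie node.
            trie.append({})
            trie[node][ch] = len(trie) - 1
            factors.append(text[start:i + 1])
            node = 0
            start = i + 1
        else:
            node = child
    if start < len(text):
        # The final factor may coincide with an earlier factor.
        factors.append(text[start:])
    return factors


if __name__ == "__main__":
    print(lz78_factorize("aababcabcd"))
```

Each factor corresponds to one node of the LZ78 trie, so the number of factors equals the number of non-root trie nodes (plus one if the last factor repeats an earlier one).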
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · semigroups and automata theory
