Optimal LZ-End Parsing is Hard
Hideo Bannai, Mitsuru Funakoshi, Kazuhiro Kurita, Yuto Nakashima, Kazuhisa Seto, Takeaki Uno

TL;DR
This paper proves that the decision version of computing an optimal LZ-End parsing (one with the fewest phrases) is NP-complete, gives a MAX-SAT formulation for the problem, and analyzes the approximation ratio of greedy parsing.
Contribution
It establishes the computational hardness of optimal LZ-End parsing and provides a MAX-SAT approach and approximation bounds, advancing understanding of this compression method.
Findings
The decision version of computing an optimal LZ-End parsing is NP-complete, shown by a reduction from the vertex cover problem.
A MAX-SAT formulation for the problem is provided.
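Greedy LZ-End parsing is not always optimal: the paper gives a lower bound, asymptotically 2, on the ratio of the greedy parsing size to the optimal parsing size.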
Abstract
LZ-End is a variant of the well-known Lempel-Ziv parsing family such that each phrase of the parsing has a previous occurrence, with the additional constraint that the previous occurrence must end at the end of a previous phrase. LZ-End was initially proposed as a greedy parsing, where each phrase is determined greedily from left to right, as the longest factor that satisfies the above constraint [Kreft & Navarro, 2010]. In this work, we consider an optimal LZ-End parsing that has the minimum number of phrases in such parsings. We show that a decision version of computing the optimal LZ-End parsing is NP-complete by showing a reduction from the vertex cover problem. Moreover, we give a MAX-SAT formulation for the optimal LZ-End parsing adapting an approach for computing various NP-hard repetitiveness measures recently presented by [Bannai et al., 2022]. We also consider the…
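To make the distinction between greedy and optimal parsings concrete, here is a minimal Python sketch. It is not the authors' algorithm, and it rests on a simplifying assumption: a phrase is valid if it is a single character, or if it equals a suffix of the text up to the end of some earlier phrase. The paper's exact definition may differ (for example, in whether each phrase carries a trailing literal character, as in the original Kreft & Navarro formulation). The function names `is_valid_phrase`, `greedy_parse`, and `optimal_size` are hypothetical, and the optimal search is plain exhaustive search, usable only on very short strings.

```python
def is_valid_phrase(text, start, length, ends):
    """Assumed validity rule: a single character is always allowed;
    a longer phrase must equal text[e-length:e] for some earlier phrase end e."""
    if length == 1:
        return True
    phrase = text[start:start + length]
    return any(e >= length and text[e - length:e] == phrase for e in ends)


def greedy_parse(text):
    """Left to right, always take the longest valid phrase."""
    n = len(text)
    ends = []      # exclusive end positions of the phrases chosen so far
    phrases = []
    pos = 0
    while pos < n:
        best = 1
        # Validity is not monotone in the length, so every length is checked.
        for length in range(2, n - pos + 1):
            if is_valid_phrase(text, pos, length, ends):
                best = length
        phrases.append(text[pos:pos + best])
        pos += best
        ends.append(pos)
    return phrases


def optimal_size(text):
    """Minimum number of phrases over all valid parsings, by exhaustive
    search with a simple bound. Exponential time in the worst case."""
    n = len(text)
    best = n       # parsing into single characters is always valid

    def search(pos, ends, count):
        nonlocal best
        if pos == n:
            best = min(best, count)
            return
        if count + 1 >= best:
            return  # at least one more phrase is needed; cannot improve
        for length in range(n - pos, 0, -1):
            if is_valid_phrase(text, pos, length, ends):
                search(pos + length, ends + [pos + length], count + 1)

    search(0, [], 0)
    return best
```

Under this assumed rule, `greedy_parse("aaaa")` returns `["a", "a", "aa"]`, and `optimal_size("aaaa")` is also 3, so greedy happens to be optimal on that input. Comparing `len(greedy_parse(t))` with `optimal_size(t)` on other short strings is one way to explore where the two diverge.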
