The Rightmost Equal-Cost Position Problem
Maxime Crochemore, Alessio Langiu, Filippo Mignosi

TL;DR
This paper introduces the Rightmost Equal-Cost Position (REP) problem in text compression and proposes a data structure that finds an occurrence of a factor whose encoding cost equals that of its rightmost occurrence, for use in LZ77-based compression.
Contribution
The paper presents the Multi-Layer Suffix Tree, a data structure that returns REP(LPF) in constant time, where LPF is the longest previous factor (the greedy phrase).
Findings
REP can be found in constant time for LPF.
The data structure supports efficient pattern queries.
An occurrence whose cost equals that of the rightmost one can replace the rightmost occurrence in LZ77-based compression without increasing the encoded cost.
Abstract
LZ77-based compression schemes compress the input text by replacing factors in the text with an encoded reference to a previous occurrence, formed by the couple (length, offset). For a given factor, the smaller the offset, the smaller the resulting compression ratio. This is optimally achieved by using the rightmost occurrence of a factor in the previous text. Given a cost function, for instance the minimum number of bits used to represent an integer, we define the Rightmost Equal-Cost Position (REP) problem as the problem of finding one of the occurrences of a factor whose cost is equal to the cost of the rightmost one. We present the Multi-Layer Suffix Tree data structure that, for a text of length n, at any time i, provides REP(LPF) in constant time, where LPF is the longest previous factor, i.e. the greedy phrase, a reference to the list of REP({set of prefixes of LPF}) in…
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques · Semigroups and Automata Theory
