String Representation in Suffixient Set Size Space

Hiroki Shibata; Hideo Bannai

arXiv:2604.04377·cs.DS·April 21, 2026

TL;DR

This paper shows that the suffixient set size measure χ is reachable: every string w admits a representation of size O(χ(w)) words, obtained through a new representation scheme.

Contribution

The paper presents the first representation scheme whose size is guaranteed to be O(χ(w)) words, built on a new model called the substring equation system (SES).

Findings

01 Every string w admits a substring equation system (SES) of size O(χ(w))

02 The representation scheme is the first to achieve size O(χ(w)) words

03 The construction is based on a new model, the substring equation system

Abstract

Repetitiveness measures quantify how much repetitive structure a string contains and serve as parameters for compressed representations and indexing data structures. We study the measure $χ$, defined as the size of the smallest suffixient set. Although $χ$ has been studied extensively, its reachability, whether every string $w$ admits a string representation of size $O(χ(w))$ words, has remained an important open problem. We answer this question affirmatively by presenting the first such representation scheme. Our construction is based on a new model, the substring equation system (SES), and we show that every string admits an SES of size $O(χ(w))$.
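To make the measure $χ$ concrete, the following is a minimal brute-force sketch for tiny strings. It assumes the definition commonly used in the suffixient-set literature, which this page does not spell out: a set of positions is suffixient if every one-character right extension of every right-maximal substring is a suffix of the prefix ending at some chosen position. The function names are hypothetical, the search is exponential, and whether a terminator such as `$` is appended is left to the caller, since conventions differ. This illustrates only the measure, not the paper's SES construction.

```python
from itertools import combinations


def right_extensions(text):
    """Return every string x + a such that x is right-maximal in text.

    Assumed definition: x (possibly empty) is right-maximal if at least
    two distinct characters follow its occurrences in text.
    """
    followers = {}  # substring x -> set of characters that follow x
    n = len(text)
    for i in range(n):
        for j in range(i, n):
            # text[i:j] is x, and text[j] is the character after it
            followers.setdefault(text[i:j], set()).add(text[j])
    return {
        x + a
        for x, chars in followers.items()
        if len(chars) >= 2
        for a in chars
    }


def is_suffixient(text, positions):
    """Check the assumed suffixient property for 1-based end positions."""
    return all(
        any(text[:j].endswith(ext) for j in positions)
        for ext in right_extensions(text)
    )


def smallest_suffixient_set(text):
    """Exhaustively find a minimum-size suffixient set (tiny inputs only)."""
    n = len(text)
    exts = right_extensions(text)
    for size in range(n + 1):
        for cand in combinations(range(1, n + 1), size):
            if all(any(text[:j].endswith(e) for j in cand) for e in exts):
                return list(cand)
    return list(range(1, n + 1))  # unreachable: all positions always suffice


chi = len(smallest_suffixient_set("abab$"))
```

Under these assumptions, `smallest_suffixient_set("abab$")` returns the positions `[2, 3, 5]`, so $χ = 3$ for that string: the extensions `b`, `aba`, and `ab$` each need their own end position.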
