# Space-Efficient Algorithms for Computing Minimal/Shortest Unique Substrings

**Authors:** Takuya Mieno, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

arXiv: 1905.12854 · 2020-09-15

## TL;DR

This paper introduces space-efficient data structures and algorithms for computing minimal and shortest unique substrings in a string, enabling fast, output-sensitive queries for both interval and point cases.

## Contribution

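It presents space-efficient data structures that answer interval and point SUS queries in output-sensitive $O(\mathit{occ})$ time, where $\mathit{occ}$ is the number of SUSs returned, together with space-efficient algorithms for computing the minimal unique substrings of a string.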

## Key findings

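- The interval SUS data structure uses $4n + o(n)$ bits and answers a query in $O(\mathit{occ})$ time.
- The point SUS data structure uses $\lceil (\log_2{3} + 1)n \rceil + o(n)$ bits (about $2.585n$ bits) and answers a query in the same output-sensitive time.
- Space-efficient algorithms compute the minimal unique substrings of $T$.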
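To make the two space bounds easier to compare, the leading constant of the point-query bound stated in the abstract can be evaluated numerically. This is only arithmetic on the stated formulas; the lower-order $o(n)$ terms are ignored, and the example length $n = 10^6$ is an arbitrary illustration, not a figure from the paper.

$$
\log_2 3 + 1 \approx 1.585 + 1 = 2.585,
\qquad
\lceil (\log_2 3 + 1)\,n \rceil \approx 2.585\,n \;<\; 4n .
$$

For $n = 10^6$, this gives roughly $2.585 \times 10^6$ bits (about 323 KB) for point queries versus $4 \times 10^6$ bits (500 KB) for interval queries, before the $o(n)$ terms.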

## Abstract

Given a string $T$ of length $n$, a substring $u = T[i..j]$ of $T$ is called a shortest unique substring (SUS) for an interval $[s,t]$ if (a) $u$ occurs exactly once in $T$, (b) $u$ contains the interval $[s,t]$ (i.e. $i \leq s \leq t \leq j$), and (c) every substring $v$ of $T$ with $|v| < |u|$ containing $[s,t]$ occurs at least twice in $T$. Given a query interval $[s, t] \subset [1, n]$, the interval SUS problem is to output all the SUSs for the interval $[s,t]$. In this article, we propose a $4n + o(n)$ bits data structure answering an interval SUS query in output-sensitive $O(\mathit{occ})$ time, where $\mathit{occ}$ is the number of returned SUSs. Additionally, we focus on the point SUS problem, which is the interval SUS problem for $s = t$. Here, we propose a $\lceil (\log_2{3} + 1)n \rceil + o(n)$ bits data structure answering a point SUS query in the same output-sensitive time. We also propose space-efficient algorithms for computing the minimal unique substrings of $T$.
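The definition of an interval SUS can be checked directly on small inputs. The following is a minimal brute-force sketch in Python, not the paper's data structure: it takes quadratic space and far more than $O(\mathit{occ})$ time, and serves only to illustrate conditions (a) to (c). The function name `interval_sus` and the use of `collections.Counter` are choices made for this illustration; positions are 1-indexed and inclusive, as in the abstract.

```python
from collections import Counter


def interval_sus(text, s, t):
    """Return every SUS for the 1-indexed inclusive interval [s, t].

    Each SUS is reported as a pair (i, j) meaning text[i..j].
    Brute force for illustration only; not the paper's method.
    """
    n = len(text)
    if not (1 <= s <= t <= n):
        raise ValueError("need 1 <= s <= t <= len(text)")

    # Count the occurrences of every substring, one per start/end pair,
    # so overlapping occurrences are counted correctly.
    counts = Counter(
        text[a:b] for a in range(n) for b in range(a + 1, n + 1)
    )

    # Try candidate lengths from shortest to longest. The first length
    # with a unique substring covering [s, t] yields all the SUSs.
    for length in range(t - s + 1, n + 1):
        found = []
        first_start = max(1, t - length + 1)   # ensures j >= t
        last_start = min(s, n - length + 1)    # ensures i <= s and j <= n
        for i in range(first_start, last_start + 1):
            j = i + length - 1
            if counts[text[i - 1:j]] == 1:     # condition (a): unique
                found.append((i, j))
        if found:
            return found
    return []  # unreachable for valid input: the whole text is unique


if __name__ == "__main__":
    # Point query (s = t = 3) on "banana": "ban" and "nan" are both SUSs.
    print(interval_sus("banana", 3, 3))
    # Interval query [5, 6]: the only SUS is "nana".
    print(interval_sus("banana", 5, 6))
```

On `"banana"`, the point query at position 3 returns two answers because no unique substring of length 1 or 2 covers that position, while two unique substrings of length 3 do. This is why the query time is stated in terms of $\mathit{occ}$.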

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.12854/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1905.12854/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1905.12854/full.md

---
Source: https://tomesphere.com/paper/1905.12854