# An invertible transform for efficient string matching in labeled digraphs

**Authors:** Abhinav Nellore, Austin Nguyen, and Reid F. Thompson

arXiv: 1905.03424 · 2021-11-24

## TL;DR

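This paper applies the powerset construction to edge-labeled digraphs to obtain a transform that is invertible under a stated condition, supports string matching in time linear in the query length, and retrieves the terminal vertices of matching walks. It also proposes a class of transforms that includes both this one and the Burrows-Wheeler transform.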

## Contribution

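The paper makes three contributions: a sufficient condition under which a labeled digraph $G$ is uniquely determined by its powerset-construction transform $G'$; a way to alter any $G$ so that $G'$ supports retrieval of the terminal vertices of all walks in the unaltered $G$ matching a query; and two proposed defining properties of a class of transforms that includes the Burrows-Wheeler transform.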

## Key findings

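- The transform $G'$ decides whether some walk in $G$ matches a query string $q$ in time linear in $|q|$ and independent of $|V|$ and $|E|$.
- $G$ is uniquely determined by $G'$ when every vertex $v_\ell$ is the origin of a closed walk matching a distinct string $s_\ell$, and no other walk matches $s_\ell$ unless it starts and ends at $v_\ell$.
- After a suitable alteration of $G$, all $t$ terminal vertices of walks in the unaltered $G$ matching $q$ can be retrieved in $O(|q| + t \log |V|)$ time.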

## Abstract

Let $G = (V, E)$ be a digraph where each vertex is unlabeled, each edge is labeled by a character in some alphabet $\Omega$, and any two edges with both the same head and the same tail have different labels. The powerset construction gives a transform of $G$ into a weakly connected digraph $G' = (V', E')$ that enables solving the decision problem of whether there is a walk in $G$ matching an arbitrarily long query string $q$ in time linear in $|q|$ and independent of $|E|$ and $|V|$. We show $G$ is uniquely determined by $G'$ when for every $v_\ell \in V$, there is some distinct string $s_\ell$ on $\Omega$ such that $v_\ell$ is the origin of a closed walk in $G$ matching $s_\ell$, and no other walk in $G$ matches $s_\ell$ unless it starts and ends at $v_\ell$. We then exploit this invertibility condition to strategically alter any $G$ so its transform $G'$ enables retrieval of all $t$ terminal vertices of walks in the unaltered $G$ matching $q$ in $O(|q| + t \log |V|)$ time. We conclude by proposing two defining properties of a class of transforms that includes the Burrows-Wheeler transform and the transform presented here.
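To make the matching step concrete, here is a minimal sketch of the textbook powerset (subset) construction applied to an edge-labeled digraph, assuming a walk may start at any vertex. It is an illustration only, not the paper's exact transform: the function names `powerset_transform`, `matches`, and `terminal_vertices` and the `(tail, label, head)` edge encoding are hypothetical, the number of subset states can grow exponentially in $|V|$ in the worst case, and `terminal_vertices` simply reads off the reached subset rather than using the altered-graph technique behind the $O(|q| + t \log |V|)$ bound.

```python
from collections import defaultdict


def powerset_transform(vertices, edges):
    """Build a deterministic transition table over subsets of vertices.

    `edges` is an iterable of (tail, label, head) triples (assumed encoding).
    The start state is the set of all vertices, because a matching walk
    may begin anywhere in the graph.
    """
    edges = list(edges)
    step = defaultdict(set)  # (vertex, label) -> set of successor vertices
    for tail, label, head in edges:
        step[(tail, label)].add(head)
    alphabet = sorted({label for _, label, _ in edges})

    start = frozenset(vertices)
    transitions = {}  # (subset, label) -> subset
    seen = {start}
    stack = [start]
    while stack:
        state = stack.pop()
        for ch in alphabet:
            nxt = frozenset(h for v in state for h in step.get((v, ch), ()))
            if not nxt:
                continue  # no walk continues with this character
            transitions[(state, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return start, transitions


def matches(start, transitions, q):
    """Return True if some walk in the original graph matches q."""
    state = start
    for ch in q:  # one table lookup per query character
        state = transitions.get((state, ch))
        if state is None:
            return False
    return True


def terminal_vertices(start, transitions, q):
    """Return the set of end vertices of all walks matching q."""
    state = start
    for ch in q:
        state = transitions.get((state, ch))
        if state is None:
            return set()
    return set(state)


# Small example graph: 0 -a-> 1, 1 -b-> 2, 2 -a-> 0, 1 -a-> 1
V = [0, 1, 2]
E = [(0, "a", 1), (1, "b", 2), (2, "a", 0), (1, "a", 1)]
start, transitions = powerset_transform(V, E)
print(matches(start, transitions, "ab"))                    # True
print(matches(start, transitions, "bb"))                    # False
print(sorted(terminal_vertices(start, transitions, "a")))   # [0, 1]
```

Once the table is built, a query touches it once per character, which is why the decision time depends on $|q|$ rather than on $|V|$ or $|E|$. A practical implementation would likely number the subset states as integers so each lookup avoids hashing whole sets.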

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.03424/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1905.03424/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1905.03424/full.md

---
Source: https://tomesphere.com/paper/1905.03424