RLE edit distance in near optimal time

Raphaël Clifford; Paweł Gawrychowski; Tomasz Kociumaka; Daniel P. Martin; Przemysław Uznański

arXiv:1905.01254·cs.DS·May 6, 2019


TL;DR

This paper presents a near-optimal algorithm for computing edit distance between run-length encoded strings, significantly improving previous results and approaching theoretical limits under standard complexity assumptions.

Contribution

The authors develop an algorithm that computes the edit distance between run-length encoded strings in near-optimal time, closing a line of research that began in 1993.

Findings

01

Achieves $\tilde{O}(mn)$ time complexity for run-length encoded strings.

02

Improves on the previous best algorithm by a factor of $\mathcal{O}(n/\log(mn))$.

03

The running time is within subpolynomial factors of optimal, assuming SETH.


Table 1: Types of internal turning points and their behaviour subject to the SWM operation. Columns: Type DI, Type ID, Type IF, Type FD, Type FI, Type DF.

Table 2: Types of endpoints and their behaviour subject to the SWM operation. Columns: Type -I, Type -F, Type -D, Type I-, Type F-, Type D-.



Raphaël Clifford

Department of Computer Science, University of Bristol, UK

Paweł Gawrychowski

Institute of Computer Science, University of Wrocław, Poland

Tomasz Kociumaka (supported by ISF grants no. 824/17 and 1278/16 and by an ERC grant MPM under the EU’s Horizon 2020 Research and Innovation Programme, grant no. 683064)

Department of Computer Science, Bar-Ilan University, Israel

Institute of Informatics, University of Warsaw, Poland

Daniel P. Martin

School of Mathematics, University of Bristol, UK

Heilbronn Institute for Mathematical Research, Bristol, UK

Przemysław Uznański

Institute of Computer Science, University of Wrocław, Poland

Abstract

We show that the edit distance between two run-length encoded strings of compressed lengths $m$ and $n$ respectively, can be computed in $\mathcal{O}(mn\log(mn))$ time. This improves the previous record by a factor of $\mathcal{O}(n/\log(mn))$ . The running time of our algorithm is within subpolynomial factors of being optimal, subject to the standard SETH-hardness assumption. This effectively closes a line of algorithmic research first started in 1993.

1 Introduction

The edit distance is one of the most common distance measures between strings. For two strings of lengths $M$ and $N$ respectively, the edit distance counts the minimum number of single-character insertions, deletions and substitutions needed to transform one string into the other. The first record of an $\mathcal{O}(MN)$ algorithm to compute the edit distance is from 1968 [14], although it was rediscovered independently a number of times subsequently. Masek and Paterson improved the running time to $\mathcal{O}(MN/\log{M})$ in 1980 and this is the fastest known algorithm to date [12]. Much more recently it has been shown that no $\mathcal{O}(MN^{1-\epsilon})$ time edit distance algorithm can exist, subject to the strong exponential time hypothesis (SETH) [4, 5]. As a result, it is likely that little further progress can be made in terms of improving its worst-case complexity.

In this paper we focus on the problem of computing the edit distance between two compressed strings. The run-length encoding (RLE) of a string compresses consecutive identical symbols into a run, denoted $\sigma^{i}$ if the symbol $\sigma$ is repeated $i$ times. For example, $\texttt{aaabbbbaaa}$ would be compressed to $\texttt{a}^{3}\texttt{b}^{4}\texttt{a}^{3}$. Run-length encoding is commonly used for image compression but also has wider applications, for example in image processing [9, 15] and succinct data structures [11].
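As a small illustration of the encoding just described, the following sketch converts a string to and from its runs. The function names `rle_encode` and `rle_decode` and the list-of-pairs representation are assumptions made for this example, not notation from the paper.

```python
from itertools import groupby


def rle_encode(text):
    """Return the run-length encoding of text as (symbol, run length) pairs."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def rle_decode(runs):
    """Expand a list of (symbol, run length) pairs back into a string."""
    return "".join(symbol * length for symbol, length in runs)


runs = rle_encode("aaabbbbaaa")
print(runs)  # [('a', 3), ('b', 4), ('a', 3)]
```

Here the uncompressed length is 10 while the compressed length (the number of runs) is 3, which is the quantity the running times below are expressed in.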

In 1993 Bunke and Csirik proposed the first algorithm for computing the edit distance between RLE strings. For two strings of RLE-compressed lengths $m$ and $n$ respectively, their algorithm runs in $\mathcal{O}(mn)$ time in the special case where all the runs are of the same length [6]. However, the running time falls back to the naive complexity of $\mathcal{O}(MN)$ in the worst case, where $M$ and $N$ are the uncompressed lengths of the two strings. This worst-case complexity was subsequently improved to $\mathcal{O}(Nm+Mn)$ [7, 3] and then to $\mathcal{O}(\min\{Nm,Mn\})$ in 2007 [10]. Finally, in 2013, the fastest solution prior to the current work was given, running in $\mathcal{O}(mn^{2})$ time, where $n\geq m$ [8]. This was also the first algorithm for the RLE edit distance problem whose running time did not depend on the uncompressed lengths of the input strings.

For uncompressed strings, the longest common subsequence (LCS) problem has long been considered a close relative of the edit distance problem. This is partly due to the similarity of their dynamic programming solutions and partly because LCS is a special case of edit distance when general costs are allowed for the different mismatch and substitution operations. Moreover, the two problems have the same quadratic time upper bounds and SETH-hardness lower bounds [5]. Somewhat surprisingly, however, the histories of algorithms for LCS and edit distance have not mirrored each other when the problems are considered on RLE strings. In particular, an $\mathcal{O}(mn\log(mn))$ time algorithm for computing the LCS on RLE strings was given in 1999 [2], which is considerably faster than has been possible up to this point for the edit distance problem. Some work has also been carried out since that date to improve the log factor in the running time complexity for the LCS problem [1, 13].

In this paper we speed up the running time for the edit distance problem on RLE strings by a factor of $\mathcal{O}(n/\log(mn))$ , matching the fastest LCS algorithm to within a logarithmic factor and making it within subpolynomial factors of being optimal, assuming SETH holds. As a result, our new algorithm shows that the LCS and edit distance problems are indeed of essentially the same complexity even when the input strings are run-length encoded. This effectively closes a line of algorithmic research first started in 1993.

Theorem 1.1.

Given two RLE strings of compressed lengths $m$ and $n$ respectively, there exists an algorithm that computes their edit distance in $\mathcal{O}(mn\log(mn))$ time.

2 Previous Work and Preliminaries

The classic dynamic programming solution for computing the edit distance between uncompressed strings $X$ and $Y$ of uncompressed lengths $M$ and $N$ respectively, computes the distance between all prefixes $X[1,\dots,i]$ and $Y[1,\dots,j]$ . The key recurrence which enables us to do this efficiently is given by:

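$$\textsf{ED}(i,j)=\min\begin{cases}\textsf{ED}(i-1,j)+1,\\ \textsf{ED}(i,j-1)+1,\\ \textsf{ED}(i-1,j-1)+[X[i]\neq Y[j]],\end{cases}$$

where $[X[i]\neq Y[j]]$ is $1$ if the two symbols differ and $0$ otherwise, with base cases $\textsf{ED}(i,0)=i$ and $\textsf{ED}(0,j)=j$.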

From this the classic $O(MN)$ time solution follows directly.
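For reference, a minimal sketch of the classic quadratic-time dynamic programme on uncompressed strings is given below. It fills the table of distances between all pairs of prefixes; the function name `edit_distance` and the unit costs for all three operations are assumptions of this example.

```python
def edit_distance(x, y):
    """Classic O(MN) dynamic programme over all pairs of prefixes of x and y."""
    m, n = len(x), len(y)
    ed = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        ed[i][0] = i                      # delete the first i symbols of x
    for j in range(n + 1):
        ed[0][j] = j                      # insert the first j symbols of y
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            mismatch = 0 if x[i - 1] == y[j - 1] else 1
            ed[i][j] = min(ed[i - 1][j] + 1,             # deletion
                           ed[i][j - 1] + 1,             # insertion
                           ed[i - 1][j - 1] + mismatch)  # match or substitution
    return ed[m][n]


print(edit_distance("kitten", "sitting"))  # 3
```

The table has $(M+1)(N+1)$ cells and each is filled in constant time, which is where the quadratic bound comes from.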

The previous approaches for the edit distance problem on RLE strings take this recurrence and the implied dynamic programming table as their starting point. The basic idea was introduced by Bunke and Csirik [6], whose algorithm works by dividing the dynamic programming table into “blocks”, where each block is defined by a run in the original strings.

For each block the central task is to compute its bottom row and rightmost column given the bottom row of the block above and the rightmost column of the block to the left. For simplicity of terminology, we will refer to the rightmost column of the block to the left and the bottom row of the block above collectively as the input border of a block and the bottom row and rightmost column of a block as its output border. Figure 1 illustrates an example.

In [7, 3] it was shown that the work needed to derive the values of all the output borders of blocks is at most linear in their length. When computing the edit distance between strings $X$ and $Y$ , the length of each row in the dynamic programming table is the uncompressed length of string $Y$ and the length of each column is the uncompressed length of $X$ . If there are $m$ runs in string $X$ and $n$ runs in $Y$ then the total time complexity for computing the edit distance using their approach is therefore $O(Nm+Mn)$ .

The work closest to ours is the $O(mn^{2})$ algorithm of Chen and Chao [8]. They observe that the borders of the blocks in the dynamic programming table are piecewise linear with gradient $\pm 1$ or $0$. The borders can therefore be concisely represented by specifying their starting values as well as the positions and types of the points of changing gradient, called the turning points. They prove that for a given block the number of turning points in an output border is at most a constant greater than in its input border. Consequently, a simple calculation shows that the total number of turning points is $\mathcal{O}(mn^{2})$ (for $m\leq n$). Chen and Chao arrive at their final complexity by designing a procedure that computes the representation of the output border of a block, given the representation of its input border, in time proportional to the number of turning points. We now summarise their approach using our own notation.

There are two distinct types of blocks in the dynamic programming table. A match block corresponds to a rectangle where the corresponding symbols in the two strings match. A mismatch block corresponds to a rectangle where the corresponding symbols mismatch. Figure 1 shows both match and mismatch blocks. Borrowing notation from [8] we say that element $(i,j)$ of the dynamic programming table ED is in diagonal $j-i$ . Let $(i_{d},j_{d})$ be the intersection of the input border with diagonal $j-i$ .

Lemma 2.1 ([8, Lemma 1]).

For a match block, $\textsf{ED}(i,j)=\textsf{ED}(i_{d},j_{d})$ .

Lemma 2.1 indicates that for a match block we can simply copy the values from the corresponding position in the input border to derive the values of the output border. The main challenge is therefore how to handle mismatch blocks.
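Lemma 2.1 can be checked by brute force on small inputs. The sketch below builds the full dynamic programming table, cuts it into blocks along run boundaries, and verifies that every cell of a match block equals the input-border cell on the same diagonal. All function names are hypothetical helpers for this illustration, and the quadratic table is used only for checking, not as part of the paper's algorithm.

```python
from itertools import groupby


def ed_table(x, y):
    """Full edit distance table ed[i][j] for prefixes x[:i] and y[:j]."""
    m, n = len(x), len(y)
    ed = [[i + j if i == 0 or j == 0 else 0 for j in range(n + 1)]
          for i in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            mismatch = 0 if x[i - 1] == y[j - 1] else 1
            ed[i][j] = min(ed[i - 1][j] + 1, ed[i][j - 1] + 1,
                           ed[i - 1][j - 1] + mismatch)
    return ed


def run_boundaries(s):
    """Table indices at which the runs of s begin and end, starting with 0."""
    bounds = [0]
    for _, run in groupby(s):
        bounds.append(bounds[-1] + len(list(run)))
    return bounds


def match_blocks_copy_diagonals(x, y):
    """True if every match-block cell equals its diagonal's input-border cell."""
    ed = ed_table(x, y)
    rows, cols = run_boundaries(x), run_boundaries(y)
    for r0, r1 in zip(rows, rows[1:]):
        for c0, c1 in zip(cols, cols[1:]):
            if x[r0] != y[c0]:
                continue                      # mismatch block: lemma is silent
            for i in range(r0, r1 + 1):
                for j in range(c0, c1 + 1):
                    k = min(i - r0, j - c0)   # steps back to the input border
                    if ed[i][j] != ed[i - k][j - k]:
                        return False
    return True


print(match_blocks_copy_diagonals("aaabbbbaaa", "aabbbaaaa"))  # True
```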

For mismatch blocks Chen and Chao’s algorithm, in a similar manner to previous RLE edit distance algorithms, splits the problem into two parts corresponding to shortest paths that pass through the leftmost column or the top row [6, 7, 3, 8]. Consider a mismatch block of height $h$ and width $w$ corresponding to runs $\texttt{a}^{h-1}$ and $\texttt{b}^{w-1}$ such that $h\leq w$ (the other case can be processed similarly by swapping the left and the top part of the input border). $\operatorname{LEFT}[1,h]$ and $\operatorname{TOP}[1,w]$ denote the values of the left and the top part of the input border, numbered in a bottom-to-top and a left-to-right direction, respectively. For an array $S[1,n]$ and a parameter $h\in\mathbb{Z^{+}}$, let $S^{(h)}[i]=\min\{S[j]\mid i-h+1\leq j\leq i,1\leq j\leq n\}$. Chen and Chao separately compute all the output border values that are derived from a value in the left part of the input border, denoted $\operatorname{OUT}_{\operatorname{LEFT}}[1,w+h-1]$, and similarly compute all the output border values that are derived from a value in the top part of the input border, denoted $\operatorname{OUT}_{\operatorname{TOP}}[1,w+h-1]$, as follows.

[TABLE]

We start with reformulating the algorithmic framework of Chen and Chao using the following notation.

Definition 2.2.

Let $S[1,n]$ be a 1-indexed array of length $n$ .

• For a parameter $h\in\mathbb{Z^{+}}$, $\textsf{SWM}(S,h)$ returns the array $S^{(h)}$ of length $n+h-1$.

• $\textsf{split}(S,m)$ returns the two subarrays $S[1,m]$ and $S[m+1,n]$.

• $S\pm\overrightarrow{1}$ returns $S$ with the gradient increased/decreased by one, or formally $S^{\prime}[i]=S[i]\pm i$.

• For an integer constant $c$, $S+c$ returns $S$ with every value increased by $c$.

• $\textsf{initialise}(\ell)$ returns an array of length $\ell$ initially filled with zeroes.

• $\textsf{join}(S_{1},S_{2})$ simply concatenates two arrays.
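To make the semantics of Definition 2.2 concrete, here is a naive sketch of the six operations on plain Python lists. It takes time linear (or worse) in the array length, so it illustrates only what the operations compute, not the efficient implementation developed in Section 3. The names `swm`, `add_gradient` and `add_constant` are chosen for this example; positions are 1-indexed in the definition and shifted to Python's 0-indexing inside the code.

```python
def swm(s, h):
    """S^(h)[i] = min of S[j] over i-h+1 <= j <= i and 1 <= j <= n, for i = 1..n+h-1."""
    n = len(s)
    out = []
    for i in range(1, n + h):                 # 1-indexed output positions
        lo, hi = max(1, i - h + 1), min(n, i)
        out.append(min(s[lo - 1:hi]))         # slice covers S[lo..hi]
    return out


def split(s, m):
    """Return the subarrays S[1, m] and S[m+1, n]."""
    return s[:m], s[m:]


def add_gradient(s, sign):
    """S +/- vec(1): S'[i] = S[i] + sign * i, with sign equal to +1 or -1."""
    return [value + sign * i for i, value in enumerate(s, start=1)]


def add_constant(s, c):
    """S + c: every value increased by c."""
    return [value + c for value in s]


def initialise(length):
    """An array of the given length filled with zeroes."""
    return [0] * length


def join(s1, s2):
    """Concatenate two arrays."""
    return s1 + s2


print(swm([3, 1, 2], 2))  # [3, 1, 1, 2]
```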

Now, equations (1) and (2) can be rephrased as Algorithms 1 and 2, respectively. The final step of the algorithm is to compute the output border as the minimum of $\operatorname{OUT}_{\operatorname{TOP}}[i]$ and $\operatorname{OUT}_{\operatorname{LEFT}}[i]$ for each index $i$ . This is performed in linear time per block by Chen and Chao [8]. In Section 3 we will design a new implementation of both algorithms and a subtle amortised argument for this final step. The latter is based on the fundamental property of the values in an output border summarised by Lemma 2.3.

Lemma 2.3 ([8, Lemma 7]).

If there exists an $i$ such that $\operatorname{OUT}_{\operatorname{TOP}}[i]\leq\operatorname{OUT}_{\operatorname{LEFT}}[i]$ , then $\operatorname{OUT}_{\operatorname{TOP}}[j]\leq\operatorname{OUT}_{\operatorname{LEFT}}[j]$ for all $j\geq i$ .

Before we go on to explain how we speed up the task of deriving the borders of blocks, it is worth exploring for a moment why we cannot simply apply, perhaps with some small modifications, the known $O(mn\log(mn))$ time solution for LCS on RLE strings [2]. The key obstacle comes from the different nature of optimal paths in the dynamic programming table of the LCS and edit distance problems.

For the LCS problem on RLE strings Apostolico et al. [2] introduced two important concepts. The first is forced paths and the second corner paths. They say that a path beginning at the upper-left corner of a match block is forced if it traverses the block by strictly diagonal moves and, whenever the right (respectively, lower) side of an intermediate match block is reached, proceeds to the next match block by a straight horizontal (respectively, vertical) line through the mismatch blocks in between. A corner path is one that enters match blocks in the top left corner and exits only through the bottom right corner. They show that there is always a longest common subsequence between two strings corresponding to the concatenation of subpaths of corner and forced paths. This fact greatly reduces the number of different paths that have to be considered and hence the complexity of solving the overall LCS problem. However for the edit distance problem this property of forced paths no longer holds. Figures 3 and 3 show an example of this key difference between optimal paths under edit distance and LCS. In Figure 3 we can see that there is no optimal vertical (or horizontal) path through the mismatch block. By contrast, there is indeed an optimal vertical path for the LCS problem as illustrated by Figure 3.

In order to speed up edit distance computation on RLE strings we introduce a new data structure for input borders and output borders. This will allow us to derive the values of output borders from their respective input borders in amortised logarithmic time per border, rather than the previous linear time. The rest of the paper is devoted to this task.

3 Efficient Manipulation of Piecewise-Linear Functions

In this section, we describe the data structure we will use to represent input borders and output borders in the dynamic programming table. We will then show how the operations from Definition 2.2 can be implemented efficiently using this data structure.

Recall that a piecewise linear function is a real-valued function $F$ whose domain $\operatorname{dom}(F)$ is a closed interval that can be represented as a union of closed intervals $\operatorname{dom}(F)=\bigcup_{j=1}^{k}I_{j}$ such that $F$ restricted to $I_{j}$ is an affine function ( $g_{j}x+h_{j}$ for some coefficients $g_{j}$ and $h_{j}$ ). The input and output borders as defined in Section 2 are by definition piecewise linear.

In this section, we impose a few further restrictions:

• For each integer $x\in\operatorname{dom}(F)$, the value $F(x)$ is also an integer.

• The gradient $g_{j}$ of each $F|_{I_{j}}$ is $-1$, $0$, or $1$.

• The endpoints of $\operatorname{dom}(F)$ are integers or half-integers.

The graph of a piecewise linear function $F$ is a simple polygonal curve, and thus it can be interpreted as a sequence of turning points connected by straight-line segments. Due to the restrictions imposed on $F$, each turning point has integer or half-integer coordinates. We represent such a function $F$ as a sequence of segments stored in an annotated balanced binary search tree, where each segment explicitly keeps the coordinates of its endpoints (note that the coordinates of each internal turning point are stored with both incident segments).

We first provide a simple implementation of curves supporting a few basic operations, and then we gradually augment it to handle more complicated operations. We conclude with an amortised running time analysis.

3.1 Basic Operations

Our first implementation just stores the corresponding segment for each node $v$ :

$(x_{\ell},y_{\ell})$ :

The coordinates of the left endpoint of the segment corresponding to $v$ .

$(x_{r},y_{r})$ :

The coordinates of the right endpoint of the segment corresponding to $v$ .

Nevertheless, we are already able to implement some operations useful in Algorithms 1 and 2.

Create

The create operation produces a function $F$ whose graph consists of just one segment $S$ with given endpoints $(x_{\ell},y_{\ell})$ and $(x_{r},y_{r})$ . This enables us to implement the whole of line 4 of Algorithm 1 in worst-case constant time.

Join

The join operation takes two functions, $F_{L}$ and $F_{R}$ with domains $\operatorname{dom}(F_{L})=[x_{L},x_{M}]$ and $\operatorname{dom}(F_{R})=[x_{M},x_{R}]$ , respectively, and with a common endpoint $F_{L}(x_{M})=F_{R}(x_{M})$ . It returns a function $F$ with $\operatorname{dom}(F)=[x_{L},x_{R}]$ such that $F_{L}=F|_{[x_{L},x_{M}]}$ and $F_{R}=F|_{[x_{M},x_{R}]}$ . To implement this operation, we first join the two balanced binary search trees. If the rightmost segment of $F_{L}$ has the same gradient as the leftmost segment of $F_{R}$ , we also join these segments. The resulting tree represents $F$ . The worst-case running time is logarithmic.

Split

The split operation takes a function $F$ with $\operatorname{dom}(F)=[x_{L},x_{R}]$ and a value $x_{M}\in\operatorname{dom}(F)$ . It returns two functions $F_{L}=F|_{[x_{L},x_{M}]}$ and $F_{R}=F|_{[x_{M},x_{R}]}$ . To implement it, we first descend the binary search tree to find a segment $S$ with $x_{M}\in\operatorname{dom}(S)$ . If $x_{M}$ lies in the interior of $\operatorname{dom}(S)$ , we split this segment into two. Next, we split the binary search tree to separate the segments to the left of $x_{M}$ from the segments to the right of $x_{M}$ . The resulting two trees represent $F_{L}$ and $F_{R}$ , respectively. The worst-case running time is logarithmic.
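The behaviour of join and split can be sketched on a simpler representation: a sorted Python list of turning points `(x, y)`, with linear interpolation in between. This is an assumption made for illustration only; it costs linear time per operation, whereas the balanced binary search tree described above achieves logarithmic time. Length-0 segments are not handled, and `Fraction` is used so that half-integer coordinates stay exact.

```python
from fractions import Fraction


def evaluate(points, x):
    """Value at x of the piecewise linear function through the given points."""
    for px, py in points:
        if px == x:
            return py
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 < x < x1:
            return y0 + (y1 - y0) * Fraction(x - x0) / Fraction(x1 - x0)
    raise ValueError("x lies outside the domain")


def split(points, x_m):
    """Return the restrictions of F to [x_L, x_M] and to [x_M, x_R]."""
    y_m = evaluate(points, x_m)
    left = [p for p in points if p[0] < x_m] + [(x_m, y_m)]
    right = [(x_m, y_m)] + [p for p in points if p[0] > x_m]
    return left, right


def join(left, right):
    """Glue two functions that share the point at x_M."""
    assert left[-1] == right[0], "the functions must agree at the common endpoint"
    merged = left + right[1:]
    i = len(left) - 1                          # index of the shared point
    if 0 < i < len(merged) - 1:
        (x0, y0), (x1, y1), (x2, y2) = merged[i - 1], merged[i], merged[i + 1]
        # Same gradient on both sides: merge the two segments into one.
        if (y1 - y0) * (x2 - x1) == (y2 - y1) * (x1 - x0):
            del merged[i]
    return merged


f = [(0, 0), (2, 2), (4, 0)]
f_left, f_right = split(f, 1)
print(f_left, f_right)
print(join(f_left, f_right) == f)  # True
```

Splitting in the interior of a segment introduces a new endpoint, and joining the two halves back removes it again because the gradient does not change there.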

Combine

The combine operation takes two functions $F_{1}$ and $F_{2}$ over the same domain $\operatorname{dom}(F_{1})=\operatorname{dom}(F_{2})=[x_{L},x_{R}]$ and returns their pointwise minimum: a function $F$ with $\operatorname{dom}(F)=[x_{L},x_{R}]$ such that $F(x)=\min(F_{1}(x),F_{2}(x))$ for $x\in\operatorname{dom}(F)$ . We assume that there exists $x_{M}\in[x_{L},x_{R}]$ such that $F_{1}(x)>F_{2}(x)$ if $x<x_{M}$ and $F_{1}(x)\leq F_{2}(x)$ if $x\geq x_{M}$ .

If $F_{1}(x_{L})\leq F_{2}(x_{L})$ , then $x_{M}=x_{L}$ . Hence, we return $F=F_{1}$ and discard $F_{2}$ . Similarly, if $F_{1}(x_{R})>F_{2}(x_{R})$ , then $x_{M}=x_{R}$ . Hence, we return $F=F_{2}$ and discard $F_{1}$ .

Otherwise, we are guaranteed that $F_{1}(x_{M})=F_{2}(x_{M})$ . Our first task is to find $x_{M}$ . For this, we locate segments $S_{1}$ of $F_{1}$ and $S_{2}$ of $F_{2}$ such that $x_{M}\in\operatorname{dom}(S_{1})\cap\operatorname{dom}(S_{2})$ .

We observe that $S_{1}$ corresponds to the leftmost node $v$ in the BST of $F_{1}$ such that $v.y_{r}=F_{1}(v.x_{r})\leq F_{2}(v.x_{r})$ . Hence, we perform a left-to-right in-order traversal of the BST to find $S_{1}$ . For each visited node $v$ , we evaluate $F_{2}(v.x_{r})$ by descending the BST of $F_{2}$ to find a segment whose domain contains $v.x_{r}$ . Symmetrically, $S_{2}$ corresponds to the rightmost node $v$ in the BST of $F_{2}$ such that $F_{1}(v.x_{\ell})>F_{2}(v.x_{\ell})=v.y_{\ell}$ , so we find $S_{2}$ by performing a right-to-left in-order traversal of the BST.

Next, we note that $(x_{M},F_{1}(x_{M}))=(x_{M},F_{2}(x_{M}))$ is the leftmost common point of $S_{1}$ and $S_{2}$ . Hence, we can now compute $x_{M}$ easily (the restrictions on $F_{1}$ and $F_{2}$ guarantee that it is an integer or a half-integer). Finally, we split both $F_{1}$ and $F_{2}$ at $x_{M}$ , discard $F_{1}|_{[x_{L},x_{M}]}$ and $F_{2}|_{[x_{M},x_{R}]}$ , join $F_{2}|_{[x_{L},x_{M}]}$ with $F_{1}|_{[x_{M},x_{R}]}$ , and return the resulting function as $F$ .

As far as the running time is concerned, the cost is logarithmic for each discarded segment. We can now also implement the final combine step that produces our representation of the output border from the outputs of Algorithms 1 and 2 by finding the minimum at each index.
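The combine operation can be illustrated on the same kind of simplified representation, a sorted list of turning points `(x, y)`. The sketch below is a linear-scan stand-in under the crossing assumption stated above ($F_{1}>F_{2}$ to the left of $x_{M}$ and $F_{1}\leq F_{2}$ from $x_{M}$ onwards); it does not reproduce the tree traversal or its amortised cost, and the function names are assumptions of this example.

```python
from fractions import Fraction


def evaluate(points, x):
    """Value at x of the piecewise linear function through the given points."""
    for px, py in points:
        if px == x:
            return py
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 < x < x1:
            return y0 + (y1 - y0) * Fraction(x - x0) / Fraction(x1 - x0)
    raise ValueError("x lies outside the domain")


def combine(f1, f2):
    """Pointwise minimum of f1 and f2 under the single-crossing assumption."""
    if f1[0][1] <= f2[0][1]:
        return list(f1)                       # x_M = x_L: f1 is never above f2
    if f1[-1][1] > f2[-1][1]:
        return list(f2)                       # f1 stays above f2 everywhere
    # Between consecutive breakpoints both functions are affine, so the
    # difference f1 - f2 is affine there and its zero is found directly.
    xs = sorted({x for x, _ in f1} | {x for x, _ in f2})
    for x_a, x_b in zip(xs, xs[1:]):
        d_a = evaluate(f1, x_a) - evaluate(f2, x_a)
        d_b = evaluate(f1, x_b) - evaluate(f2, x_b)
        if d_a > 0 and d_b <= 0:
            x_m = x_a + (x_b - x_a) * Fraction(d_a) / Fraction(d_a - d_b)
            break
    y_m = evaluate(f1, x_m)
    # Keep f2 to the left of x_M and f1 to the right of it.
    left = [p for p in f2 if p[0] < x_m]
    right = [p for p in f1 if p[0] > x_m]
    return left + [(x_m, y_m)] + right


print(combine([(0, 1), (1, 0)], [(0, 0), (1, 1)]))
```

In the printed example the two functions cross at $x_{M}=1/2$, which shows why half-integer turning points arise even though both inputs take integer values at integer arguments.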

3.2 Shifts

Next, we extend our data structure to implement the shift operation which moves the whole function by a given vector. It is useful in Algorithms 1 and 2 for altering $S^{\ell}$ and $S^{r}$ .

Formally, given a function $F$ with $\operatorname{dom}(F)=[x_{L},x_{R}]$ and a vector $(\Delta_{x},\Delta_{y})$ , we transform $F$ into $F^{\prime}$ such that $F^{\prime}(x)=F(x-\Delta_{x})+\Delta_{y}$ for each $x\in\operatorname{dom}(F^{\prime})=[x_{L}+\Delta_{x},x_{R}+\Delta_{x}]$ .

This update is performed using a technique known as lazy propagation. We augment each node $v$ with the following extra field:

$(\delta_{x},\delta_{y})$ :

A deferred shift to be propagated within the subtree of $v$ .

This change is then lazily propagated as further operations are executed. Here, we rely on a key structural property of BST operations:

Observation 3.1.

The execution of every BST operation can be extended (at the cost of an extra multiplicative constant in the running time) with a sequence of node activations and deactivations such that:

• a node $v$ is accessed only when it is active and has no active descendant,

• when $v$ is active, then all its ancestors are active,

• no node is active at the beginning and the end of the execution.

The idea behind lazy propagation is that the deferred updates stored at a node $v$ are propagated when $v$ is activated. This way, every active node has no delayed updates pending. Hence, from the perspective of any other operation, the effect is the same as if we had meticulously modified every node for each update.

The shift propagation is very simple: when a node $v$ receives a request for a shift by $(\Delta_{x},\Delta_{y})$ , then we just add $(\Delta_{x},\Delta_{y})$ to the delayed shift $(v.\delta_{x},v.\delta_{y})$ stored at $v$ . Upon activation of $v$ , we propagate $(v.\delta_{x},v.\delta_{y})$ to the children of $v$ , add $(v.\delta_{x},v.\delta_{y})$ to both $(x_{\ell},y_{\ell})$ and $(x_{r},y_{r})$ , and reset $(v.\delta_{x},v.\delta_{y}):=(0,0)$ . To implement the shift operation, we just send a request for a shift by $(\Delta_{x},\Delta_{y})$ to the root node $r$ .

The worst-case running time of a shift is constant, and the extra cost of propagation does not increase the asymptotic running time of the remaining operations.
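The deferred shift can be sketched as follows. The tree here is a fixed, unbalanced binary tree of segments, and `request_shift`, `activate` and `segments` are names chosen for this illustration; rebalancing and the activation discipline of Observation 3.1 are deliberately left out.

```python
class Node:
    """One segment of the function plus a deferred shift for its whole subtree."""

    def __init__(self, x_l, y_l, x_r, y_r, left=None, right=None):
        self.x_l, self.y_l, self.x_r, self.y_r = x_l, y_l, x_r, y_r
        self.left, self.right = left, right
        self.dx, self.dy = 0, 0               # deferred shift (delta_x, delta_y)


def request_shift(v, dx, dy):
    """Record a shift at v without visiting its subtree: constant time."""
    v.dx += dx
    v.dy += dy


def activate(v):
    """Pass the deferred shift to the children, apply it to v, then reset it."""
    for child in (v.left, v.right):
        if child is not None:
            request_shift(child, v.dx, v.dy)
    v.x_l += v.dx
    v.x_r += v.dx
    v.y_l += v.dy
    v.y_r += v.dy
    v.dx, v.dy = 0, 0


def segments(v):
    """In-order list of segments; every node is activated before it is read."""
    if v is None:
        return []
    activate(v)
    return (segments(v.left)
            + [((v.x_l, v.y_l), (v.x_r, v.y_r))]
            + segments(v.right))


root = Node(1, 1, 2, 2, left=Node(0, 0, 1, 1), right=Node(2, 2, 3, 1))
request_shift(root, 5, -1)                    # two shifts accumulate at the root
request_shift(root, 1, 1)
print(segments(root))
```

Both shift requests touch only the root; the children receive the accumulated vector $(6,0)$ when the traversal first activates their parent.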

3.3 Gradient Changes

The gradient change operation takes a function $F$ and a coefficient $\Delta_{g}$ , and it transforms $F$ into $F^{\prime}$ such that $F^{\prime}(x)=F(x)+\Delta_{g}\cdot x$ for each $x\in\operatorname{dom}(F^{\prime})=\operatorname{dom}(F)$ . This operation is needed in both Algorithms 1 and 2 to transform $S^{r}$ and $S^{\ell}$ , respectively.

We first note that the constraints imposed on the gradients of functions $F$ and $F^{\prime}$ yield that $\Delta_{g}=-1$ , $F$ is non-decreasing, and $F^{\prime}$ is non-increasing, or $\Delta_{g}=1$ , $F$ is non-increasing, and $F^{\prime}$ is non-decreasing. However, these limitations only become relevant in Section 3.4.

To implement gradient change, we just add another field to each node $v$ :

$\delta_{g}$ :

A deferred gradient change to be propagated within the subtree of $v$ .

We now have two types of lazily propagated updates: shift and gradient change. These two operations do not commute, so we need to decide how to interpret the two kinds of deferred updates stored at a node $v$. We shall assume that the gradient change by $\delta_{g}$ is to be performed before the shift by $(\delta_{x},\delta_{y})$.

Thus, while shift propagation is implemented as in Section 3.2, adding $\Delta_{g}$ to $v.\delta_{g}$ is insufficient when a node $v$ receives a request to change gradient by $\Delta_{g}$ : we also need to add $\Delta_{g}\cdot\delta_{x}$ to $\delta_{y}$ . This approach is correct since a shift by $(\Delta_{x},\Delta_{y})$ followed by a gradient change by $\Delta_{g}$ is equivalent to a gradient change by $\Delta_{g}$ followed by a shift by $(\Delta_{x},\Delta_{y}+\Delta_{g}\cdot\Delta_{x})$ .
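The equivalence used here can be verified directly from the definitions of the two operations. In the derivation below, $F_{1},F_{2},G_{1},G_{2}$ and $\Delta_{y}^{\prime}$ are auxiliary symbols introduced only for this check.

```latex
\begin{align*}
\text{Shift by } (\Delta_x,\Delta_y)\text{, then gradient change by } \Delta_g:\quad
  F_1(x) &= F(x-\Delta_x)+\Delta_y,\\
  F_2(x) &= F_1(x)+\Delta_g\cdot x
          = F(x-\Delta_x)+\Delta_y+\Delta_g\cdot x.\\[4pt]
\text{Gradient change by } \Delta_g\text{, then shift by } (\Delta_x,\Delta_y'):\quad
  G_1(x) &= F(x)+\Delta_g\cdot x,\\
  G_2(x) &= G_1(x-\Delta_x)+\Delta_y'
          = F(x-\Delta_x)+\Delta_g\cdot(x-\Delta_x)+\Delta_y'.
\end{align*}
```

Comparing the two final expressions, $F_{2}=G_{2}$ holds for every $x$ exactly when $\Delta_{y}^{\prime}=\Delta_{y}+\Delta_{g}\cdot\Delta_{x}$, which is the correction added to $\delta_{y}$.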

Upon activation of $v$, we first apply the deferred gradient change: we propagate it to the children of $v$, increase $v.y_{\ell}$ by $v.\delta_{g}\cdot v.x_{\ell}$ and $v.y_{r}$ by $v.\delta_{g}\cdot v.x_{r}$, and reset $v.\delta_{g}:=0$. Then, we handle the deferred shift as in Section 3.2.

Finally, we note that to implement the gradient change operation, we just send a request for a gradient change by $\Delta_{g}$ to the root node $r$ . The worst-case running time is constant.

3.4 Sliding Window Minima

We can finally show how to implement the SWM operation efficiently on our data structure. This is the most involved of the operations we will need. The SWM operation, given a function $F$ with $\operatorname{dom}(F)=[x_{L},x_{R}]$ and a window width $t$, returns a function $F^{\prime}$ with $\operatorname{dom}(F^{\prime})=[x_{L},x_{R}+t]$ such that $F^{\prime}(x)=\min\{F(x^{\prime}):x^{\prime}\in[x-t,x]\cap\operatorname{dom}(F)\}$.

Combinatorial Properties

We begin by observing that the SWM operation is composable.

Observation 3.2.

Every function $F$ and positive window widths $t,t^{\prime}$ satisfy $\textsf{SWM}(\textsf{SWM}(F,t),t^{\prime})=\textsf{SWM}(F,t+t^{\prime})$ .
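Observation 3.2 can be illustrated on a discrete analogue in which the function is sampled at integer positions $0,\dots,n-1$ and the window $[x-t,x]$ is restricted to integers. This restriction is an assumption of the sketch (the functions in this section may also have turning points at half-integers), and `swm` is a name chosen for this example.

```python
def swm(values, t):
    """Discrete sliding window minimum of width t.

    out[x] is the minimum of values[x'] over integers x' in [x - t, x] that
    lie in the domain 0..n-1; the domain of the result grows by t positions.
    """
    n = len(values)
    return [min(values[max(0, x - t):min(n - 1, x) + 1]) for x in range(n + t)]


f = [3, 1, 4, 1, 5, 9, 2, 6]

# Composability: width 2 followed by width 3 equals width 5.
print(swm(swm(f, 2), 3) == swm(f, 5))  # True

# Consequently, width t can be obtained by applying width 1 exactly t times.
g = f
for _ in range(4):
    g = swm(g, 1)
print(g == swm(f, 4))  # True
```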

Hence, instead of applying $\textsf{SWM}(\cdot,t)$ for an integer width $t$ , we may equivalently apply the $\textsf{SWM}(\cdot,1)$ operation $t$ times. The key property of width $1$ is that the changes to the transformed function are very local. The structure of these modifications can be described in terms of types of turning points. We classify internal turning points by the gradients (Increasing, Flat, or Decreasing) of the incident segments; see Table 1, where we also analyse how a function changes in the vicinity of each turning point subject to $\textsf{SWM}(\cdot,1)$ .

• A point of type FD or DF remains intact.

• A point of type FI or IF is shifted by $(1,0)$.

• A point of type ID is shifted by $(0.5,-0.5)$.

• A point of type DI is transformed into a point of type DF and a point of type FI, and the latter is shifted by $(1,0)$.

Note that the behaviour of DI points is unlike that of other types. However, this discrepancy disappears if we replace every DI point with two coinciding points of types DF and FI, respectively, with an artificial length-0 segment in between. Hence, whenever a new internal turning point is created (which happens only within the join operation), if the turning point would be of type DI, we pre-emptively replace it by two coinciding points of type DF and FI, respectively. Note that the resulting length-0 segment never changes its gradient since gradient change is allowed only on a monotone function. However, when an incident segment is modified, we may need to remove the length-0 segment. This process cannot cascade, though: removing one length-0 segment never causes another length-0 segment to be removed.

Next, we analyse in Table 2 how the $\textsf{SWM}(F,1)$ operation affects the endpoints of the graph of $F$. In most cases, the left endpoint stays intact and the right endpoint is shifted by $(1,0)$, in line with the domain growing from $[x_{L},x_{R}]$ to $[x_{L},x_{R}+1]$. The only exceptions are endpoints of type -I and D-, which exhibit similar behaviour to the internal turning points of type DI. Moreover, this discrepancy also disappears if we introduce artificial flat segments of length 0. Hence, we replace a point of type -I with two points of type -F and FI, respectively, and a point of type D- with two points of type DF and F-, respectively. However, this time the replacement is not pre-emptive: we perform it as the first step in the implementation of the SWM operation. This is possible because there are just two endpoints, while the number of internal turning points of type DI could be large. Our gain, on the other hand, is that we avoid length-0 segments changing their gradients.

With the artificial length-0 segments in place, it is now true that the effect of the SWM operation on each turning point can be described as a shift depending only on the type of the point. As a result of these shifts, some segments may disappear as their length reaches 0; in this case, we say that a segment collapses. Only segments of three kinds may collapse:

• a segment between a point of type ID and a point of type DF;

• a segment between a point of type IF and a point of type FD;

• a segment between a point of type FI and a point of type ID.

When a segment collapses, it is removed and the two incident turning points are merged. (Two adjacent segments may collapse simultaneously. In that special case, three subsequent points, of type FI, ID, and DF, respectively, need to be deleted.) Each segment of the three affected kinds has a collapse time, defined as the smallest $t$ for which $\textsf{SWM}(\cdot,t)$ makes it collapse (assuming no interaction from incident segments); it is equal to the Manhattan distance between its endpoints. Note that due to the restrictions on the piecewise linear functions considered in this section, the collapse time is always an integer.

Implementation

To implement the SWM operation, we augment each node $v$ with the following fields:

$\textrm{type}_{\ell},\textrm{type}_{r}$ :

The types of the turning points joined by the segment corresponding to $v$ .

$\delta_{t}$ :

The amount of a deferred SWM to be propagated within the subtree of $v$ .

$t_{\min}$ :

The minimum collapse time among the segments in the subtree of $v$ .

Note that the type of each internal turning point is stored twice. Hence, whenever the type of a turning point changes, this fact needs to be reflected at both incident segments (and we need to reach the corresponding nodes by descending the BST; shortcuts would violate Observation 3.1).

The field $v.t_{\min}$ is of a kind we have not encountered yet: its value depends on the corresponding values for the children of $v$ and on other fields at $v$ . It is brought up to date whenever $v$ is deactivated (so that it can be accessed only when $v$ is inactive). We shall assume that its value already reflects the deferred updates stored at $v$ . The procedure of recomputing $v.t_{\min}$ is simple: we determine the collapse time of the segment represented by $v$ (which is infinite or equal to $|v.x_{r}-v.x_{\ell}|+|v.y_{r}-v.y_{\ell}|$ depending on the types of the incident turning points), and take the minimum of this value and $u.t_{\min}$ for every child $u$ of $v$ . Since $v$ has no deferred changes when it is deactivated, the resulting minimum is $v.t_{\min}$ .
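The recomputation of $v.t_{\min}$ can be sketched as below. The class layout, the string encoding of turning-point types (with `-F` and `F-` for endpoints, following Table 2), and the names `collapse_time` and `recompute_t_min` are assumptions of this example; deferred updates and rebalancing are omitted, so every node is treated as having nothing pending.

```python
import math

# (type of left turning point, type of right turning point) for the three
# kinds of segments that may collapse under SWM.
COLLAPSING_KINDS = {("ID", "DF"), ("IF", "FD"), ("FI", "ID")}


def collapse_time(v):
    """Manhattan distance between the endpoints, or infinity if v cannot collapse."""
    if (v.type_l, v.type_r) in COLLAPSING_KINDS:
        return abs(v.x_r - v.x_l) + abs(v.y_r - v.y_l)
    return math.inf


def recompute_t_min(v):
    """Bring v.t_min up to date from v's own segment and its children."""
    candidates = [collapse_time(v)]
    candidates += [u.t_min for u in (v.left, v.right) if u is not None]
    v.t_min = min(candidates)


class Node:
    def __init__(self, x_l, y_l, x_r, y_r, type_l, type_r, left=None, right=None):
        self.x_l, self.y_l, self.x_r, self.y_r = x_l, y_l, x_r, y_r
        self.type_l, self.type_r = type_l, type_r
        self.left, self.right = left, right
        recompute_t_min(self)     # children are built first, so their t_min is ready


# Flat on [0, 1], rising to a peak at (3, 2), falling to (5, 0), flat on [5, 6].
a = Node(0, 0, 1, 0, "-F", "FI")
b = Node(1, 0, 3, 2, "FI", "ID", left=a)
d = Node(5, 0, 6, 0, "DF", "F-")
root = Node(3, 2, 5, 0, "ID", "DF", left=b, right=d)
print(root.t_min)  # 4
```

In this example the rising and the falling segment both have collapse time 4, so they collapse simultaneously: a window of width 4 flattens the whole bump.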

Propagation

The main structural modification to the lazy propagation procedures is that we maintain an additional invariant: no deferred changes are stored on the leftmost or on the rightmost path of the BST representing any function $F$ . To maintain this invariant, immediately after lazily updating the whole of $F$ (by sending a request to the root node $r$ ), we descend to the leftmost and to the rightmost segment of $F$ ; this increases the cost of shift and gradient change to logarithmic. Note that the split operation must anyway visit the nodes representing the new boundary segments (to update the types of new endpoints). Moreover, if a path from the root to a given node $v$ contains no deferred updates, then this is still true after any rebalancing of the BST (as only active nodes get rotated).

Concerning the lazy SWM propagation, we explicitly forbid requesting an SWM with window width exceeding $r.t_{\min}$ , because collapsed segments need to be removed before we proceed further. Also, the window widths (and hence the values $\delta_{t}$ ) are always non-negative.

We have three kinds of deferred updates now: SWM, gradient change, and shift. We fix the semantics of the fields $\delta_{t}$ , $\delta_{g}$ , and $(\delta_{x},\delta_{y})$ so that an SWM of width $\delta_{t}$ is performed first, a gradient change by $\delta_{g}$ second, and a shift by $(\delta_{x},\delta_{y})$ last. The requests for a shift and for a gradient change are still implemented as in Section 3.3; note that these updates do not affect the collapse times (the three segment kinds with finite collapse times cannot appear in monotone functions). On the other hand, the request for an SWM with a window width $\Delta_{t}$ requires more care. We clearly need to increase $\delta_{t}$ by $\Delta_{t}$ and decrease $t_{\min}$ by the same amount. The aforementioned steps suffice if $\delta_{g}=0$ . Otherwise, we note that the turning points in the subtree of $v$ are all of types DF and FD or all of types IF and FI. (Observe that there are no deferred changes in the proper ancestors of $v$ and that $v$ is not on the leftmost or rightmost path in this case.) We can easily distinguish the two cases by analysing the endpoints of the segment corresponding to $v$ . Moreover, the SWM operation is void in the first of these cases, and in the second one it reduces to a shift by $(\Delta_{t},0)$ . Hence, we shall implement it this way rather than by modifying $v.\delta_{t}$ .
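The case analysis for an SWM request can be sketched as below. This is only an illustration under assumptions: the class `LazyNode`, the flag `if_fi_subtree` (standing in for the inspection of the endpoints of the segment at $v$), and the way `request_shift` accumulates into $(\delta_{x},\delta_{y})$ are hypothetical simplifications of the procedures of the earlier sections, which are not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class LazyNode:
    delta_t: int = 0          # deferred SWM width (applied first)
    delta_g: int = 0          # deferred gradient change (applied second)
    delta_x: int = 0          # deferred shift (applied last)
    delta_y: int = 0
    t_min: float = math.inf   # minimum collapse time in the subtree
    # Hypothetical flag, meaningful only when delta_g != 0: True if all turning
    # points in the subtree are of types IF and FI, False if DF and FD.
    if_fi_subtree: bool = False


def request_shift(v: LazyNode, dx: int, dy: int) -> None:
    """Assumed form of a lazy shift request: accumulate into the deferred shift."""
    v.delta_x += dx
    v.delta_y += dy


def request_swm(v: LazyNode, width: int) -> None:
    """Lazily request an SWM of the given window width at node v."""
    if width < 0 or width > v.t_min:
        raise ValueError("width must lie in [0, v.t_min]")
    if v.delta_g == 0:
        # No pending gradient change: defer the SWM itself.
        v.delta_t += width
        v.t_min -= width
    elif v.if_fi_subtree:
        # Types IF and FI only: the SWM reduces to a shift by (width, 0).
        request_shift(v, width, 0)
    # Otherwise (types DF and FD only) the SWM is void: nothing to do.
```

Routing the request through the shift when $\delta_{g}\neq 0$ keeps the fixed order of deferred updates (SWM, then gradient change, then shift) intact.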

The propagation itself is relatively easy: upon activation of a node $v$ , we first propagate the SWM operation to the children of $v$ , update the endpoints of the segment corresponding to $v$ (according to Tables 1 and 2, with the shift multiplied by $\delta_{t}$ ), and finally reset $\delta_{t}=0$ . Then, we propagate the gradient change and the shift. This is implemented as in Section 3.3 except that the gradient change now affects not only the coordinates but also the types of the segment’s endpoints.

SWM Procedure

To implement the SWM procedure itself, we first check the endpoint types and perform appropriate substitutions for endpoints of type -I and D-. Next, we would like to lazily apply the SWM operation with window width $t$ to the root $r$ . However, this could result in a negative collapse time $r.t_{\min}$ , so instead we perform SWM gradually based on Observation 3.2. If $r.t_{\min}<t$ , we make a request for SWM with window width just $r.t_{\min}$ , leaving the remaining quantity $t-r.t_{\min}$ for later on. This already results in $r.t_{\min}=0$ , which indicates that there is a collapsed segment. We descend the tree to find such a collapsed segment (activating nodes on the way there and deactivating them on the way back), and take care of this segment appropriately (this may affect neighbouring segments as well). We repeat the process as long as $r.t_{\min}=0$ . Once this value is positive again, we are ready to proceed with further application of SWM.

As far as the running time is concerned, the cost of SWM consists of a logarithmic term for visiting the endpoints and further logarithmic terms for each collapsed segment.
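The control flow of this gradual application can be illustrated with a toy model. The sketch below is an assumption-laden simplification: segments are reduced to a bag of positive integer collapse times kept in a heap, the counter `elapsed` plays the role of the accumulated $\delta_{t}$, and the effect of a collapse on neighbouring segments (merging turning points, changed types) is ignored entirely. The function name `apply_swm_gradually` is hypothetical.

```python
import heapq
import math
from typing import List, Tuple


def apply_swm_gradually(collapse_times: List[int], t: int) -> Tuple[List[int], int]:
    """Apply a total SWM width t in steps that never exceed the current
    minimum collapse time. Returns the remaining collapse times of the
    surviving segments and the number of collapsed segments."""
    heap = list(collapse_times)
    heapq.heapify(heap)
    elapsed = 0       # total width applied so far (toy analogue of delta_t)
    collapsed = 0
    remaining = t
    while True:
        t_min = heap[0] - elapsed if heap else math.inf
        step = min(remaining, t_min)   # never request more than t_min
        elapsed += step
        remaining -= step
        # Remove every segment whose collapse time has reached zero.
        while heap and heap[0] - elapsed == 0:
            heapq.heappop(heap)
            collapsed += 1
        if remaining == 0:
            break
    return sorted(d - elapsed for d in heap), collapsed
```

For instance, with collapse times `[2, 5, 5, 9]` and `t = 6`, the width is applied in steps of 2, 3 and 1, three segments collapse, and the survivor is left with collapse time 3.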

3.5 Complexity Analysis

We complete this section by showing that the aforementioned operations run in amortised logarithmic time.

Lemma 3.3.

A sequence of $k$ operations on piecewise linear functions takes $\mathcal{O}(k\log k)$ time.

Proof.

Our potential is $\log k$ multiplied by the total number of turning points in all the stored functions. First, we observe that each operation increases this potential by at most $\mathcal{O}(\log k)$ , since it creates only a constant number of new turning points. In particular, the total number of turning points is $\mathcal{O}(k)$ , so manipulating BSTs takes $\mathcal{O}(\log k)$ time. Next, we note that the worst-case running time of most operations is $\mathcal{O}(\log k)$ , with extra $\mathcal{O}(\log k)$ time needed for each discarded or collapsed segment. However, every such segment decreases the potential by $\log k$ . ∎
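The accounting in the proof can be spelled out as follows. This is a sketch under assumptions: $N$ denotes the total number of turning points, $s$ the number of segments discarded or collapsed by one operation, $b=\mathcal{O}(1)$ the number of turning points the operation creates, and $c$ a constant (introduced here, not in the original) by which the potential is scaled so that it dominates the hidden constant in the running time.

$$
\begin{aligned}
\Phi &= c\cdot N\cdot\log k,\\
\text{actual cost} &\le c\,(1+s)\log k,\\
\Delta\Phi &\le c\,(b-s)\log k,\\
\text{amortised cost} &= \text{actual cost}+\Delta\Phi \le c\,(1+b)\log k=\mathcal{O}(\log k).
\end{aligned}
$$

Summing over the $k$ operations, and using that the potential is initially zero and never negative, the total actual cost is at most the sum of the amortised costs, that is, $\mathcal{O}(k\log k)$.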

4 An $\mathcal{O}(mn\log(mn))$ time RLE Edit Distance Algorithm

As in the previous algorithm by Chen and Chao [8], we go through the dynamic programming table block by block. For every block, we transform the representation of its input border to the representation of its output border. As mentioned earlier, borders are piecewise linear with gradient $\pm 1$ or $0$ , so they can be maintained in the structure described in Section 3. We will assume that the left and the top part of the input border of every block are stored in separate structures. We start by generating the structures corresponding to the left and the top border of the whole dynamic programming table. These left and top borders are each a single decreasing and increasing sequence, respectively. As a result, we can generate the data structure for the parts corresponding to all blocks trivially in $\mathcal{O}(m+n)$ time using $m+n$ create operations. Now, we have to describe how to obtain the structure corresponding to the right and the bottom part of the output border of the current block given the structures corresponding to the left and the top part of its input border. We stress that any structure will be created and then used once as an input to a further operation, which is crucial for the amortisation argument within Lemma 3.3.

Recall that the semantics of split and join operating on arrays in Section 2 and of split and join operating on piecewise linear functions in Section 3 differ slightly: split now creates two functions that both contain the value of the original function at $x_{M}$ ; symmetrically, join takes two functions defined on $[x_{L},x_{M}]$ and $[x_{M},x_{R}]$ that share the same value at $x_{M}$ . This is, however, not an issue because the cases in both (1) and (2) overlap at the boundaries.

For a match block, the value stored in an element $(i,j)$ of the output border is a copy of the value stored in the corresponding element $(i_{d},j_{d})$ of the input border. Recalling that $(i_{d},j_{d})$ is the intersection of the input border with diagonal $j-i$ , this can be readily implemented with a constant number of split and join operations.
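On explicit arrays, the diagonal copy for a match block can be sketched as below. This is a naive reference useful for testing, not the split/join implementation: the function name and the local indexing (rows $0..h$, columns $0..w$, with `top[j]` the value at $(0,j)$ and `left[i]` the value at $(i,0)$) are conventions assumed here.

```python
from typing import List, Tuple


def match_block_output(top: List[int], left: List[int]) -> Tuple[List[int], List[int]]:
    """Output border (bottom row, right column) of a match block, obtained by
    copying each value along its diagonal from the input border."""
    assert top[0] == left[0], "top and left share the corner value"
    h, w = len(left) - 1, len(top) - 1

    def input_value(i: int, j: int) -> int:
        # Walk up the diagonal j - i until the input border is hit.
        d = min(i, j)
        i0, j0 = i - d, j - d
        return left[i0] if j0 == 0 else top[j0]

    bottom = [input_value(h, j) for j in range(w + 1)]
    right = [input_value(i, w) for i in range(h + 1)]
    return bottom, right
```

For example, `match_block_output([3, 4, 5, 6], [3, 2, 1])` yields the bottom row `[1, 2, 3, 4]` and the right column `[6, 5, 4]`.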

For a mismatch block, we need to apply Algorithms 1 and 2, merge the returned solutions by taking the minimum at every position, and finally create separate structures corresponding to the right and the bottom part of the output border with a single split operation. Note that while we have already observed that both input border and output border are piecewise linear with gradient $\pm 1$ or $0$ , we need to make sure that the same is true for every function obtained inside Algorithms 1 and 2, and for $\operatorname{OUT}_{\operatorname{TOP}}$ and $\operatorname{OUT}_{\operatorname{LEFT}}$ in particular.

Lemma 4.1.

Every function obtained in Algorithms 1 and 2 is piecewise linear with gradient $\pm 1$ or $0$ .

Proof.

Consider Algorithm 1. It is easy to verify that $S$ and hence also $S^{\ell}$ and $S^{r}$ are indeed piecewise linear with gradient $\pm 1$ or $0$ . Additionally, $S^{\ell}[i]$ is equal to the minimum in $\operatorname{LEFT}[1,i]$ and so $S^{\ell}$ is non-increasing. Consequently, $S_{1}$ , $S_{2}$ , and $S_{3}$ are all piecewise linear with gradient $\pm 1$ or $0$ . We only need to verify that the same holds for their concatenation. This is true because each of these three parts corresponds to a case considered in (1), and these cases overlap at the boundaries.

Next, consider Algorithm 2. Similarly as above, it is easy to verify that $S$ and so also $S^{\ell}$ and $S^{r}$ are piecewise linear with gradient $\pm 1$ or $0$ . Furthermore, $S^{r}[i]$ is equal to the minimum in $\operatorname{TOP}[w-h+i+1,w]$ and so $S^{r}$ is non-decreasing. Thus, $S_{1}$ and $S_{2}$ are piecewise linear with gradient $\pm 1$ or $0$ and the same holds for their concatenation because the cases in (2) overlap at the boundaries. ∎
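When testing an implementation, a cell-by-cell reference for a single mismatch block can serve as ground truth. The sketch below does not follow Algorithms 1 and 2; it simply fills the block with the textbook unit-cost recurrence for mismatching characters, which takes time proportional to the block area. The function name and the local indexing (`top[j]` at $(0,j)$, `left[i]` at $(i,0)$) are assumptions made for this illustration.

```python
from typing import List, Tuple


def mismatch_block_output(top: List[int], left: List[int]) -> Tuple[List[int], List[int]]:
    """Output border (bottom row, right column) of a mismatch block, computed
    naively with ED(i, j) = 1 + min(ED(i-1, j), ED(i, j-1), ED(i-1, j-1))."""
    assert top[0] == left[0], "top and left share the corner value"
    h, w = len(left) - 1, len(top) - 1
    prev = list(top)        # row i - 1 of the block
    right = [top[w]]        # last column, collected row by row
    for i in range(1, h + 1):
        cur = [left[i]]
        for j in range(1, w + 1):
            cur.append(1 + min(prev[j], cur[j - 1], prev[j - 1]))
        right.append(cur[w])
        prev = cur
    return prev, right
```

For the block in the top-left corner of the table for a run of two equal characters against a run of three different ones, `mismatch_block_output([0, 1, 2, 3], [0, 1, 2])` gives the bottom row `[2, 2, 2, 3]` and the right column `[3, 3, 3]`; adjacent values differ by at most one, as expected of a border.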

We now explain in detail how to implement Algorithm 1. We start with computing $S^{\ell}$ and $S^{r}$ by first calling $\mathsf{SWM}(\operatorname{LEFT},h-1)$ and then using split. Next, $S_{1}$ is obtained by applying gradient change and shift to $S^{\ell}$ , $S_{2}$ is obtained by calling create, and $S_{3}$ is obtained by applying shift to $S^{r}$ . Finally, $\operatorname{OUT}_{\operatorname{LEFT}}$ is created with two calls to join.
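For checking the SWM calls on small inputs, an array-level reference using the standard monotone-deque technique may help. The windowing convention is an assumption made here (the formal definition appears earlier in the paper and is not reproduced in this part): the output at position $x$ is taken to be the minimum of the input over $[x-t,x]$ clipped to the input's domain, for $x$ ranging over the domain extended by $t$ to the right. The name `swm_reference` is hypothetical.

```python
from collections import deque
from typing import List


def swm_reference(values: List[int], t: int) -> List[int]:
    """Sliding window minima on an explicit array, under the assumed
    convention out[x] = min(values[max(0, x - t) : min(n - 1, x) + 1])
    for x = 0, ..., n - 1 + t."""
    n = len(values)
    if n == 0:
        return []
    out = []
    window = deque()  # indices whose values are strictly increasing
    for x in range(n + t):
        if x < n:
            # Drop candidates that can never be the minimum again.
            while window and values[window[-1]] >= values[x]:
                window.pop()
            window.append(x)
        # Drop indices that fell out of the window [x - t, x].
        while window[0] < x - t:
            window.popleft()
        out.append(values[window[0]])
    return out
```

Under this convention, a window width of $n-1$ on an array of length $n$ produces the prefix minima in the first $n$ positions and suffix minima afterwards, which is consistent with the descriptions of $S^{\ell}$ and $S^{r}$ in the proof of Lemma 4.1.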

Algorithm 2 is implemented by calling $\mathsf{SWM}(\operatorname{TOP},w-1)$ and then using split. Next, $S_{1}$ is obtained by applying shift to $S^{\ell}$ , while $S_{2}$ is obtained by applying gradient change and shift to $S^{r}$ . Finally, $\operatorname{OUT}_{\operatorname{TOP}}$ is created by a single call to join.

Having obtained a representation of $\operatorname{OUT}_{\operatorname{LEFT}}$ and $\operatorname{OUT}_{\operatorname{TOP}}$ , we call combine to obtain a representation of the output border. Such a call is valid due to Lemma 2.3. The overall number of operations on structures is $\mathcal{O}(mn)$ , making the final time complexity $\mathcal{O}(mn\log(mn))$ by Lemma 3.3.
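To sanity-check an implementation of the full algorithm on small inputs, one can compare against a baseline that decodes both strings and runs the textbook quadratic dynamic program. The sketch below is that baseline only, with running time proportional to the product of the uncompressed lengths; the representation of a run-length encoded string as a list of `(character, length)` pairs and the function names are assumptions of this illustration.

```python
from typing import List, Tuple


def rle_decode(runs: List[Tuple[str, int]]) -> str:
    """Expand a list of (character, run length) pairs into a plain string."""
    return "".join(ch * length for ch, length in runs)


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # match or substitution
        prev = cur
    return prev[-1]


def rle_edit_distance_baseline(x: List[Tuple[str, int]],
                               y: List[Tuple[str, int]]) -> int:
    """Reference answer obtained by decoding first; for testing only."""
    return edit_distance(rle_decode(x), rle_decode(y))
```

For example, the inputs `[("a", 3), ("b", 2)]` and `[("a", 1), ("b", 4)]` decode to `aaabb` and `abbbb`, whose edit distance is 2.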

Bibliography

[1] Hsing-Yen Ann, Chang-Biau Yang, Chiou-Ting Tseng, and Chiou-Yi Hor. A fast and simple algorithm for computing the longest common subsequence of run-length encoded strings. Information Processing Letters, 108(6):360–364, 2008. doi:10.1016/j.ipl.2008.07.005.
[2] Alberto Apostolico, Gad M. Landau, and Steven Skiena. Matching for run-length encoded strings. Journal of Complexity, 15(1):4–16, 1999. doi:10.1006/jcom.1998.0493.
[3] Ora Arbell, Gad M. Landau, and Joseph S. B. Mitchell. Edit distance of run-length encoded strings. Information Processing Letters, 83(6):307–314, 2002. doi:10.1016/S0020-0190(02)00215-6.
[4] Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM Journal on Computing, 47(3):1087–1097, 2018. doi:10.1137/15M1053128.
[5] Karl Bringmann and Marvin Künnemann. Quadratic conditional lower bounds for string problems and dynamic time warping. In Venkatesan Guruswami, editor, 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, pages 79–97. IEEE Computer Society, 2015. doi:10.1109/FOCS.2015.15.
[6] Horst Bunke and János Csirik. An algorithm for matching run-length coded strings. Computing, 50(4):297–314, 1993. doi:10.1007/BF02243873.
[7] Horst Bunke and János Csirik. An improved algorithm for computing the edit distance of run-length coded strings. Information Processing Letters, 54(2):93–96, 1995. doi:10.1016/0020-0190(95)00005-W.
[8] Kuan-Yu Chen and Kun-Mao Chao. A fully compressed algorithm for computing the edit distance of run-length encoded strings. Algorithmica, 65(2):354–370, 2013. doi:10.1007/s00453-011-9592-4.
