An Efficient Shift Rule for the Prefer-Max De Bruijn Sequence

Gal Amram; Yair Ashlagi; Amir Rubin; Yotam Svoray; Moshe Schwartz; Gera Weiss

arXiv:1706.01106·cs.DM·September 24, 2018


TL;DR

This paper introduces a new, efficient shift rule for generating prefer-max De Bruijn sequences applicable to all sequence orders and alphabets, with an algorithm that is linear in sequence order.

Contribution

It formulates a universal shift rule for prefer-max De Bruijn sequences and provides a linear-time, memory-efficient algorithm for its implementation.

Findings

01

The shift rule is applicable to all sequence orders and alphabets.

02

The algorithm operates in linear time and memory relative to sequence order.

03

Computing the next letter takes $O(n)$ time and memory, compared with $\Theta(nk^{n})$ time and $\Theta(k^{n})$ memory for the greedy construction.

Abstract

A shift rule for the prefer-max De Bruijn sequence is formulated, for all sequence orders, and over any finite alphabet. An efficient algorithm for this shift rule is presented, which has linear (in the sequence order) time and memory complexity.

Equations

$$L_{0}^{r_{0}}<L_{1}^{r_{1}}<\dots<L_{Z_{k}(n)-1}^{r_{Z_{k}(n)-1}},$$

$$00<01<02<11<12<22,$$

$$L_{0}=0,\quad L_{1}=01,\quad L_{2}=02,\quad L_{3}=1,\quad L_{4}=12,\quad L_{5}=2,$$

$$\mathsf{next}(\sigma w)\triangleq\begin{cases}w(\sigma+1)&\text{if }\sigma\neq k-1\text{ and }\mathsf{head}(w\sigma),\\ w(\min S)&\text{if }\sigma=k-1\text{ and }S\triangleq\left\{{\sigma^{\prime}\in[k-1]\colon\mathsf{head}(w\sigma^{\prime})}\right\}\neq\emptyset,\\ w\sigma&\text{otherwise,}\end{cases}$$

$$L_{i}^{r_{i}}\notin\left\{{(k\!-\!2)(k\!-\!1)^{n-1},\ (k\!-\!1)^{n}}\right\}.\tag{1}$$

$$\mathsf{next}^{\left|{L_{i}}\right|}(L_{i}^{r_{i}})=\mathsf{next}^{\left|{w}\right|+1+t}(w\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}).$$

$$\mathsf{next}(\tau w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1})=w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau.$$

$$\mathcal{R}^{m}(w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau)=w_{3}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau(k\!-\!1)^{m}$$

$$w_{3}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau(k\!-\!1)^{m}=\mathcal{R}^{\left|{w_{1}}\right|+1+m}(L_{i}^{r_{i}}).$$

$$\mathcal{R}^{m}(w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\sigma^{\prime})=w_{3}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\sigma^{\prime}(k\!-\!1)^{m}\tag{2}$$

$$w_{3}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}(k\!-\!1)^{m+1}=\mathcal{R}^{\left|{w_{1}}\right|+1+m}(L_{i}^{r_{i}})$$

$$\mathcal{R}^{t}((k\!-\!1)^{t}L_{i}^{r_{i}-1}w\sigma)=L_{i}^{r_{i}-1}w\sigma(k\!-\!1)^{t}=L_{i}^{r_{i}}$$

$$\mathsf{next}((k\!-\!1)^{j+1}L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1})=(k\!-\!1)^{j}L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\tau.$$

$$x_{1}\sigma^{\prime}(k\!-\!1)^{j}<x_{1}(k\!-\!1)x_{2}=x_{1}\tau x_{2}=x,$$

$$\tau=\tau_{\min}\triangleq\min\left\{{\tau^{\prime}\in[k]\colon\mathsf{head}((k\!-\!1)^{j}L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\tau^{\prime})}\right\}.$$

$$x_{1}\tau_{\min}(k\!-\!1)^{j}<x_{1}\tau x_{2}=x,$$

$$\mathsf{next}(L_{Z_{k}(n)-2}^{r_{Z_{k}(n)-2}})=\mathsf{next}((k\!-\!2)(k\!-\!1)^{n-1})=(k\!-\!1)^{n}.\tag{3}$$

$$\nu_{s_{i+1}}\dots\nu_{s_{i+1}+n-1}=\mathsf{next}^{\left|{L_{i}}\right|}(\nu_{s_{i}}\dots\nu_{s_{i}+n-1})=\mathsf{next}^{\left|{L_{i}}\right|}(L_{i}^{r_{i}})=L_{i+1}^{r_{i+1}}.$$

$$\begin{aligned}\nu_{s_{Z_{k}(n)-2}+1}\dots\nu_{s_{Z_{k}(n)-2}+n}&=\mathsf{next}(L_{Z_{k}(n)-2}^{r_{Z_{k}(n)-2}})=\mathsf{next}((k\!-\!2)(k\!-\!1)^{n-1})\\&=(k\!-\!1)^{n},\end{aligned}$$

$$\nu_{s_{Z_{k}(n)-2}+n}=\nu_{s_{Z_{k}(n)-1}}=\nu_{k^{n}-1}=k-1=L_{Z_{k}(n)-1}.$$

$$\begin{aligned}\mathsf{next}(\nu_{k^{n}-i}\dots\nu_{k^{n}-1}\nu_{0}\dots\nu_{n-1-i})&=(k\!-\!1)^{i-1}0^{n-i+1}\\&=\nu_{k^{n}-i+1}\dots\nu_{k^{n}-1}\nu_{0}\dots\nu_{n-i}.\end{aligned}$$


Taxonomy

Topics: Algorithms and Data Compression · Coding theory and cryptography · Cellular Automata and Applications

Full text

An Efficient Shift Rule for the Prefer-Max De Bruijn Sequence

(Footnote 1: This work was supported in part by the Israeli Science Foundation (ISF) under grant no. 130/14, and grant no. 856/16.)

Gal Amram

Yair Ashlagi

Amir Rubin

Yotam Svoray

Moshe Schwartz

Gera Weiss

Department of Computer Science, Ben-Gurion University of The Negev

Department of Electrical and Computer Engineering, Ben-Gurion University of The Negev


keywords: De Bruijn sequence, Ford sequence, prefer-max sequence, shift rule

MSC: [2010] 94A55, 05C45, 05C38

journal: Discrete Mathematics

1 Introduction

A $k$ -ary De Bruijn sequence of order $n$ (denoted $(n,k)$ -DB), is a word $\left\langle{\nu_{i}}\right\rangle_{i=0}^{k^{n}-1}\triangleq\nu_{0}\nu_{1}\dots\nu_{k^{n}-1}$ over the alphabet $[k]\triangleq\left\{{0,1,\dots,k-1}\right\}$ , i.e., $\nu_{i}\in[k]$ for all $i\in{\mathbb{Z}}_{k^{n}}$ , such that all $n$ -subwords $\nu_{i}\nu_{i+1}\dots\nu_{i+n-1}$ are distinct (note that $i\in{\mathbb{Z}}_{k^{n}}$ means that indices are taken modulo $k^{n}$ ). Of the many $(n,k)$ -DB sequences that exist, a particular sequence stands out, featuring in many past works. Consider first the binary case, $k=2$ , start the sequence with $0^{n}$ , and add bit by bit, always preferring to append a $1$ , unless it creates an $n$ -word that has already been seen previously. After obtaining a sequence of length $2^{n}$ , move the $0^{n}$ prefix to the end of the sequence. The result is an $(n,2)$ -DB sequence dubbed the prefer-one sequence or Ford sequence [10, 6]. By complementing all the bits, we obtain the prefer-min $(n,2)$ -DB sequence. In the non-binary case, we can replace the prefer-one rule by a prefer-max (assuming a lexicographical order of the alphabet), resulting in the lexicographically largest $(n,k)$ -DB sequence, and symmetrically (by complementation), a prefer-min $(n,k)$ -DB sequence which is the lexicographically smallest $(n,k)$ -DB sequence.
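The greedy construction described above can be sketched in a few lines. This is only an illustrative sketch that generalizes the binary prefer-one rule to the prefer-max rule over $[k]$; the function name `prefer_max_greedy` is an assumption of the sketch and does not come from the paper.

```python
def prefer_max_greedy(n, k):
    """Greedy prefer-max construction: start from 0^n, always try the largest letter first."""
    seq = [0] * n
    seen = {tuple(seq)}                      # all n-words produced so far
    while len(seq) < k ** n:
        tail = seq[len(seq) - n + 1:]        # last n-1 letters
        for letter in range(k - 1, -1, -1):  # prefer the maximal letter
            word = tuple(tail + [letter])
            if word not in seen:
                seen.add(word)
                seq.append(letter)
                break
        else:
            raise RuntimeError("greedy construction got stuck")
    return seq[n:] + seq[:n]                 # move the 0^n prefix to the end


print("".join(map(str, prefer_max_greedy(2, 3))))  # 221201100
```

The set `seen` holds every $n$-word generated so far, which is where the exponential memory cost of the greedy approach comes from.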

The greedy bit-by-bit algorithm of [10] is certainly an inefficient way of generating the prefer-max sequence, running in $\Theta(nk^{n})$ time (integer operations), and requiring $\Theta(k^{n})$ memory. Several suggestions have been made since to improve the efficiency of the sequence construction. Fredricksen and Kessler [8], and Fredricksen and Maiorana [9] showed that the prefer-max sequence is in fact a concatenation of certain (Lyndon) words. While seemingly an inefficient way to generate the prefer-max sequence, a later careful analysis [11] has shown that this decomposition allows us to generate the sequence of length $k^{n}$ in $O(k^{n})$ time.

However, another equally important way of generating sequences is of interest, namely, by using a shift rule. It is well known that any $(n,k)$ -DB sequence $\left\langle{\nu_{i}}\right\rangle_{i=0}^{k^{n}-1}$ can be generated by a feedback shift register (FSR) of order $n$ , i.e., there exists a shift-rule function $f\colon[k]^{n}\to[k]$ such that $\nu_{i+n}=f(\nu_{i},\nu_{i+1},\dots,\nu_{i+n-1})$ for all $i\in{\mathbb{Z}}_{k^{n}}$ . Several efficiently computable shift rules for De Bruijn sequences are known, requiring $O(n)$ time and memory to generate the next letter in the sequence, given the preceding $n$ letters (see [5], as well as [12] for a comprehensive list). We also mention the recent [3], which describes an efficient shift rule for the $k$ -ary “grandmama” sequence (which is defined by a co-lexicographic order, compared with the lexicographic order of the prefer-max sequence). However, with a single exception, these rules generate sequences other than the prefer-max sequence, and the only exception [7] addresses only the generation of binary prefer-one sequences.

The main contribution of this paper is an efficient shift-rule function for the $(n,k)$ prefer-max De Bruijn sequence, $k\geqslant 2$ (the case of $k=1$ is trivial). The shift rule, given in Algorithm 1, is based on the decomposition of the prefer-max sequence found by [9], and operates in $O(n)$ time and memory. This closes a gap in the literature, since while efficient constructions for the entire prefer-max sequence are known, an efficient shift rule is only known in the binary case.

The paper is organized as follows. In Section 2 we provide the necessary notation used throughout the paper, and recall some relevant results. In Section 3 we provide a mathematical expression for the shift rule. We proceed in Section 4 to devise an efficient algorithm that implements the shift rule. We conclude in Section 5 with a short discussion of the results.

2 Preliminaries

Let us start by reviewing the necessary definitions and previous results, before presenting the new results. To avoid trivialities, we assume throughout the paper that $n,k\geqslant 2$ . With our alphabet letters $[k]$ we associate a lexicographical order, $0<1<\dots<k-1$ . This order is extended in the natural way to all finite words from $[k]^{*}$ by defining $x<y$ if either $x$ is a prefix of $y$ , or there exist (possibly empty) $u,v,w\in[k]^{*}$ and two letters $\sigma,\sigma^{\prime}\in[k]$ , $\sigma<\sigma^{\prime}$ , such that $x=u\sigma v$ and $y=u\sigma^{\prime}w$ .

Given a word $v\triangleq\sigma_{0}\sigma_{1}\dots\sigma_{n-1}$ , with $\sigma_{i}\in[k]$ , we define the rotation operator, $\mathcal{R}\colon[k]^{n}\to[k]^{n}$ as $\mathcal{R}(v)\triangleq\sigma_{1}\sigma_{2}\dots\sigma_{n-1}\sigma_{0}$ . We say that two words $v,u\in[k]^{n}$ are cyclically equivalent if there exists $i\in{\mathbb{Z}}$ such that $v=\mathcal{R}^{i}(u)$ . The equivalence classes under $\mathcal{R}$ are called necklaces. The number of necklaces, denoted by $Z_{k}(n)$ , is known to be $Z_{k}(n)\triangleq\frac{1}{n}\sum_{d|n}\phi(d)k^{n/d}$ , where $\phi$ is Euler’s totient function (also the number of cycles in the pure cycling FSR, see [13]). Let $v\in[k]^{n}$ be a word. The cyclic order of $v$ , denoted $o(v)$ , is the smallest positive integer $o(v)\in{\mathbb{N}}$ such that $\mathcal{R}^{o(v)}(v)=v$ or, alternatively, it is the number of elements in the necklace containing $v$ . If $o(v)=\left|{v}\right|$ we say that $v$ is primitive. For any $v\in[k]^{n}$ there is a unique primitive word $w\in[k]^{o(v)}$ such that $v=w^{n/o(v)}$ .
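As a quick sanity check of the necklace-counting formula, the following sketch evaluates $Z_{k}(n)$ from the totient sum and compares it with a brute-force count of rotation classes. The helper names (`totient`, `necklace_count`, `necklace_count_brute`) are chosen here for illustration only.

```python
from itertools import product
from math import gcd


def totient(d):
    """Euler's totient, computed naively (fine for small d)."""
    return sum(1 for j in range(1, d + 1) if gcd(j, d) == 1)


def necklace_count(n, k):
    """Z_k(n) = (1/n) * sum over d | n of phi(d) * k^(n/d)."""
    total = sum(totient(d) * k ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n


def necklace_count_brute(n, k):
    """Count rotation classes by collecting the least rotation of every word."""
    least = {min(v[i:] + v[:i] for i in range(n)) for v in product(range(k), repeat=n)}
    return len(least)


print(necklace_count(2, 3), necklace_count_brute(2, 3))  # 6 6
```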

A primitive word that is lexicographically least in its necklace is called a Lyndon word. If $L\in[k]^{+}$ is a Lyndon word, we shall also find it useful to define $L^{m}$ as an expanded Lyndon word, for all $m\in{\mathbb{N}}$ . (Footnote 2: In some places, by abuse of notation, a lexicographically least representative of a necklace (which coincides with our definition of an expanded Lyndon word) is also called a necklace. We shall not do the same since we shall later require a different representative of a necklace, which might cause confusion.) Additionally, we can arrange all the expanded Lyndon words of length $n$ in increasing lexicographical order

$$L_{0}^{r_{0}}<L_{1}^{r_{1}}<\dots<L_{Z_{k}(n)-1}^{r_{Z_{k}(n)-1}},$$

where $r_{i}\triangleq n/\left|{L_{i}}\right|$ . The main result of [8, 9] (rephrased to simplify the presentation) is that the prefer-min $(n,k)$ -DB sequence is in fact the concatenation $L_{0}L_{1}\cdots L_{Z_{k}(n)-1}$ . We shall use this fact later on, and call it the FKM factorization. We also comment that by complementing all the letters via $\psi\colon[k]\to[k]$ , $\psi(i)\triangleq k-1-i$ , for all $i\in[k]$ , the prefer-min $(n,k)$ -DB sequence becomes the prefer-max $(n,k)$ -DB sequence, and vice versa. We extend $\psi$ to operate on words in the natural way, i.e., applying it to all letters of the word.

Example 1.

Fix $n=2$ and $k=3$ . We then have the following lexicographical order of expanded Lyndon words,

$$00<01<02<11<12<22,$$

hence

$$L_{0}=0,\quad L_{1}=01,\quad L_{2}=02,\quad L_{3}=1,\quad L_{4}=12,\quad L_{5}=2,$$

and indeed the prefer-min $(2,3)$ -DB sequence is $L_{0}L_{1}L_{2}L_{3}L_{4}L_{5}=001021122$ . After complementing each letter we obtain $\psi(L_{0}L_{1}L_{2}L_{3}L_{4}L_{5})=221201100$ , which is the prefer-max $(2,3)$ -DB sequence.
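The FKM factorization of Example 1 can be reproduced by brute force. The sketch below enumerates all words in lexicographic order, keeps those that are least in their necklace (the expanded Lyndon words), and concatenates their primitive roots. The name `fkm_prefer_min` is an assumption of this sketch, and the enumeration takes exponential time, so it is only meant for small $n$ and $k$.

```python
from itertools import product


def fkm_prefer_min(n, k):
    """Concatenate the Lyndon roots L_0 L_1 ... of all expanded Lyndon words of length n."""
    out = []
    for v in product(range(k), repeat=n):            # lexicographic order
        rotations = [v[i:] + v[:i] for i in range(n)]
        if v == min(rotations):                      # v is an expanded Lyndon word
            # cyclic order o(v): smallest p >= 1 with R^p(v) = v
            period = next(p for p in range(1, n + 1) if rotations[p % n] == v)
            out.extend(v[:period])                   # the Lyndon word L_i
    return out


prefer_min = fkm_prefer_min(2, 3)
prefer_max = [3 - 1 - a for a in prefer_min]         # apply psi letter by letter
print(prefer_min)  # [0, 0, 1, 0, 2, 1, 1, 2, 2]
print(prefer_max)  # [2, 2, 1, 2, 0, 1, 1, 0, 0]
```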

3 Shift-Rule Construction

In this section we state and prove our shift-rule construction. For ease of presentation, we work with the prefer-min sequence, while remembering that by simply complementing letters with $\psi$ , this is equivalent to working with the prefer-max sequence.

We first require a definition that distinguishes another necklace member that is not necessarily the expanded Lyndon word $L_{i}^{r_{i}}$ defined above.

Definition 2.

A word $v\in[k]^{n}$ is a necklace head, tested by the predicate $\mathsf{head}(v)$ , if we can write $v$ as $v=(k\!-\!1)^{t}w\sigma$ , where $w\in[k]^{n-t-1}$ , $\sigma\in[k-1]$ (i.e., $\sigma\neq k-1$ ), and $\mathcal{R}^{t}(v)=w\sigma(k\!-\!1)^{t}$ is an expanded Lyndon word.

We briefly note that the necklace containing the single word $(k\!-\!1)^{n}$ does not formally have a necklace head, whereas all other necklaces have a unique necklace head. Additionally, by the above definition, if $(k\!-\!1)^{t}w\sigma$ is a necklace head, either $w=\varepsilon$ is empty, or it does not start with the letter $k-1$ .
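Definition 2 translates directly into a brute-force test. The sketch below is a quadratic-time illustration, not an efficient implementation; the function names are chosen here and are not taken from the paper. It uses the observation above that $t$ must be the full run of leading $k-1$ letters.

```python
from itertools import product


def is_expanded_lyndon(v):
    """True if v is lexicographically least among its rotations (cf. Lemma 6)."""
    return all(v <= v[i:] + v[:i] for i in range(len(v)))


def head(v, k):
    """Necklace-head predicate of Definition 2 for a tuple v over {0, ..., k-1}."""
    if v[-1] == k - 1:               # the last letter sigma must differ from k-1
        return False
    t = 0
    while v[t] == k - 1:             # length of the leading run of (k-1)'s
        t += 1
    return is_expanded_lyndon(v[t:] + v[:t])


def necklace_heads(n, k):
    return [v for v in product(range(k), repeat=n) if head(v, k)]


print(necklace_heads(2, 3))  # [(0, 0), (0, 1), (1, 1), (2, 0), (2, 1)]
```

For $n=2$, $k=3$ there are five heads, one for each of the six necklaces except the one containing $22$.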

We now define our shift rule. Traditionally, a shift rule is a function that takes $n$ consecutive letters in the sequence (i.e., the current state of an FSR generating the sequence) and its output is the next letter. However, we will find it more convenient to define a shift rule as providing the next state of the FSR. Specifically, let $\left\langle{\nu_{i}}\right\rangle_{i=0}^{k^{n}-1}$ be the prefer-min $(n,k)$ -DB sequence. A shift rule for the sequence is a function $f\colon[k]^{n}\to[k]^{n}$ such that $f(\nu_{i}\nu_{i+1}\dots\nu_{i+n-1})=\nu_{i+1}\nu_{i+2}\dots\nu_{i+n}$ , for all $i\in{\mathbb{Z}}_{k^{n}}$ .

Definition 3.

Let $\mathsf{next}\colon[k]^{n}\to[k]^{n}$ be defined by

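$$\mathsf{next}(\sigma w)\triangleq\begin{cases}w(\sigma+1)&\text{if }\sigma\neq k-1\text{ and }\mathsf{head}(w\sigma),\\ w(\min S)&\text{if }\sigma=k-1\text{ and }S\triangleq\left\{{\sigma^{\prime}\in[k-1]\colon\mathsf{head}(w\sigma^{\prime})}\right\}\neq\emptyset,\\ w\sigma&\text{otherwise,}\end{cases}$$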

where $\sigma\in[k]$ and $w\in[k]^{n-1}$ .

The main result of this section is the following theorem.

Theorem 4.

$\mathsf{next}$ is a shift rule for the prefer-min $(n,k)$ -DB sequence.

Before proceeding, we provide an example.

Example 5.

Continuing our running example from Example 1, consider again the prefer-min $(2,3)$ -DB sequence $001021122$ . Take as an example the subword $\sigma w=21$ , i.e., $\sigma=2$ and $w=1$ . In this case $\mathsf{next}(21)$ is computed using the second case of Definition 3, and $S=\left\{{1}\right\}$ since $\mathsf{head}(11)$ is true but $\mathsf{head}(10)$ is false. Thus, $\mathsf{next}(21)=11$ , which is consistent with the sequence.
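The computation in this example can be replayed mechanically. The following sketch implements the three cases of Definition 3 with a brute-force $\mathsf{head}$ test and iterates the rule from $0^{n}$. The names `next_state` and `generate` are assumptions of this sketch, and no attempt is made at efficiency here.

```python
def is_expanded_lyndon(v):
    return all(v <= v[i:] + v[:i] for i in range(len(v)))


def head(v, k):
    """Brute-force necklace-head test following Definition 2."""
    if v[-1] == k - 1:
        return False
    t = 0
    while v[t] == k - 1:
        t += 1
    return is_expanded_lyndon(v[t:] + v[:t])


def next_state(state, k):
    """Shift rule of Definition 3; state is the tuple sigma w."""
    sigma, w = state[0], state[1:]
    if sigma != k - 1 and head(w + (sigma,), k):      # first case
        return w + (sigma + 1,)
    if sigma == k - 1:                                # second case, if S is non-empty
        S = [s for s in range(k - 1) if head(w + (s,), k)]
        if S:
            return w + (min(S),)
    return w + (sigma,)                               # third case


def generate(n, k):
    """Iterate the rule k^n times from 0^n, emitting the first letter of each state."""
    state, out = (0,) * n, []
    for _ in range(k ** n):
        out.append(state[0])
        state = next_state(state, k)
    return out


print(next_state((2, 1), 3))  # (1, 1)
print(generate(2, 3))         # [0, 0, 1, 0, 2, 1, 1, 2, 2]
```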

In order to prove Theorem 4 we state a sequence of lemmas, building up to the main result.

Lemma 6.

A word $v\in[k]^{+}$ is an expanded Lyndon word if and only if $v\leqslant\mathcal{R}^{i}(v)$ for all $i\in{\mathbb{Z}}$ (i.e., it is lexicographically least in its necklace).

Proof.

Consider the (unique) decomposition $v=w^{t}$ , where $w$ is primitive. Note that $\mathcal{R}^{i}(v)=(\mathcal{R}^{i}(w))^{t}$ . Thus, $v\leqslant\mathcal{R}^{i}(v)$ if and only if $w\leqslant\mathcal{R}^{i}(w)$ , which holds for all $i\in{\mathbb{Z}}$ if and only if $w$ is a Lyndon word. ∎

A first step we take is showing that increasing the rightmost letter that is not $k-1$ in an expanded Lyndon word maintains the expanded Lyndon property.

Lemma 7.

If $w\sigma(k\!-\!1)^{t}\in[k]^{n}$ is an expanded Lyndon word and $\sigma\in[k-1]$ then $w(\sigma\!+\!1)(k\!-\!1)^{t}$ is also an expanded Lyndon word.

Proof.

If $w(\sigma\!+\!1)(k\!-\!1)^{t}$ starts with $k-1$ then it is equal to $(k\!-\!1)^{n}$ and the claim follows. Otherwise, write $w(\sigma\!+\!1)(k\!-\!1)^{t}=xy$ and we shall prove that $xy\leqslant yx$ . If $\left|{y}\right|\leqslant t$ the claim trivially holds. Otherwise, for some word $v$ , $y=v(\sigma\!+\!1)(k\!-\!1)^{t}$ and $w=xv$ . By assumption, $xv\sigma(k\!-\!1)^{t}\leqslant v\sigma(k\!-\!1)^{t}x$ . Since $\left|{v}\right|\leqslant\left|{xv}\right|$ , $xv(\sigma\!+\!1)(k\!-\!1)^{t}\leqslant v(\sigma\!+\!1)(k\!-\!1)^{t}x$ as well. ∎

We now turn, in the following lemmas, to consider connections between successive expanded Lyndon words, $L_{i}^{r_{i}}$ and $L_{i+1}^{r_{i+1}}$ .

Lemma 8.

Let $L_{i}^{r_{i}}=w\sigma(k\!-\!1)^{t}\in[k]^{n}$ be the $i$ th expanded Lyndon word in increasing lexicographical order where $\sigma\neq k-1$ . Then, the $(i+1)$ th expanded Lyndon word, $L_{i+1}^{r_{i+1}}$ , is $w(\sigma\!+\!1)x$ where $x\in[k]^{t}$ is the lexicographically smallest word for which $w(\sigma\!+\!1)x$ is an expanded Lyndon word.

Proof.

By Lemma 7, $w(\sigma\!+\!1)(k\!-\!1)^{t}$ is an expanded Lyndon word, i.e., $w(\sigma\!+\!1)(k\!-\!1)^{t}=L_{j}^{r_{j}}$ for some $j>i$ . It then follows that $L_{i+1}^{r_{i+1}}$ must be of the form $w(\sigma\!+\!1)x$ as claimed. ∎

The following lemma combines the shift rule function, $\mathsf{next}$ , and the lexicographical order of expanded Lyndon words. We use the notation $\mathsf{next}^{j}(\cdot)$ , $j\in{\mathbb{N}}$ , to denote the composition of $\mathsf{next}$ with itself $j$ times.

Lemma 9.

$\mathsf{next}^{\left|{L_{i}}\right|}(L_{i}^{r_{i}})=L_{i+1}^{r_{i+1}}$ , for all $i\in[Z_{k}(n)-2]$ .

Proof.

Since $i\in[Z_{k}(n)-2]$ , $L_{i}^{r_{i}}$ is not the lexicographically last expanded Lyndon word and not the one before it, i.e.,

$$L_{i}^{r_{i}}\notin\left\{{(k\!-\!2)(k\!-\!1)^{n-1},\ (k\!-\!1)^{n}}\right\}.\tag{1}$$

We can therefore write $L_{i}=w\sigma(k\!-\!1)^{t}$ , $\sigma\in[k-1]$ , so $L_{i}^{r_{i}}=w\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}$ . Using these notations

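$$\mathsf{next}^{\left|{L_{i}}\right|}(L_{i}^{r_{i}})=\mathsf{next}^{\left|{w}\right|+1+t}(w\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}).$$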

Our proof proceeds by establishing the following three facts:

1. $\mathsf{next}^{\left|{w}\right|}(w\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1})=\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w$ .

2. $\mathsf{next}(\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w)=(k\!-\!1)^{t}L_{i}^{r_{i}-1}w(\sigma\!+\!1)$ .

3. $\mathsf{next}^{t}((k\!-\!1)^{t}L_{i}^{r_{i}-1}w(\sigma\!+\!1))=L_{i}^{r_{i}-1}w(\sigma\!+\!1)x$ , such that $x\in[k]^{t}$ is the lexicographically smallest word for which $L_{i}^{r_{i}-1}w(\sigma\!+\!1)x$ is an expanded Lyndon word.

Combining the three facts together, we get that $\mathsf{next}^{\left|{L_{i}}\right|}(L_{i}^{r_{i}})=L_{i}^{r_{i}-1}w(\sigma\!+\!1)x$ , and by Lemma 8 this word is $L_{i+1}^{r_{i+1}}$ , as desired.

We first prove step (1). We contend that this step’s claim holds since in the first $\left|{w}\right|$ applications of $\mathsf{next}$ only the third case of the definition of $\mathsf{next}$ (cf. Definition 3) takes place. To prove this, we need to show that for any decomposition $w=w_{1}\tau w_{2}$ , $\tau\in[k]$ , $w_{1},w_{2}\in[k]^{*}$ , we have

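$$\mathsf{next}(\tau w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1})=w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau.$$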

Hence, we need to show that $w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau$ is not a necklace head, and that if $\tau=k-1$ then there is no $\sigma^{\prime}\in[k-1]$ such that $w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\sigma^{\prime}$ is a necklace head.

For the first condition, assume to the contrary that the predicate $\mathsf{head}(w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau)$ is true. By definition, there should exist an integer $0\leqslant m\leqslant\left|{w_{2}}\right|$ such that $w_{2}=(k\!-\!1)^{m}w_{3}$ and

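$$\mathcal{R}^{m}(w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau)=w_{3}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau(k\!-\!1)^{m}$$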

is an expanded Lyndon word. However, we note that

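$$w_{3}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\tau(k\!-\!1)^{m}=\mathcal{R}^{\left|{w_{1}}\right|+1+m}(L_{i}^{r_{i}}).$$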

Since $0<\left|{w_{1}}\right|+1+m<\left|{L_{i}}\right|$ , this contradicts the cyclic order of $L_{i}^{r_{i}}$ .

As for the second condition, where $\tau=k-1$ , assume to the contrary that for some $\sigma^{\prime}\in[k-1]$ , the word $w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\sigma^{\prime}$ is a necklace head. Again, there should exist an integer $0\leqslant m\leqslant\left|{w_{2}}\right|$ , such that $w_{2}=(k\!-\!1)^{m}w_{3}$ , and

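$$\mathcal{R}^{m}(w_{2}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\sigma^{\prime})=w_{3}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}\sigma^{\prime}(k\!-\!1)^{m}\tag{2}$$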

is an expanded Lyndon word. Thus, on the right-hand side of (2), the rightmost letter that is not $k-1$ , is $\sigma^{\prime}$ . By repeated applications of Lemma 7, we get that we can replace $\sigma^{\prime}$ by $k-1$ and still have an expanded Lyndon word, i.e.,

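$$w_{3}\sigma(k\!-\!1)^{t}L_{i}^{r_{i}-1}w_{1}(k\!-\!1)^{m+1}=\mathcal{R}^{\left|{w_{1}}\right|+1+m}(L_{i}^{r_{i}})$$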

is an expanded Lyndon word. As in the previous case, this contradicts the cyclic order of $L_{i}$ .

The proof of step (2) is simpler. We need to show that we fall under the first case in the definition of $\mathsf{next}$ (cf. Definition 3), i.e., that $(k\!-\!1)^{t}L_{i}^{r_{i}-1}w\sigma$ is a necklace head. That is indeed true since

$$\mathcal{R}^{t}((k\!-\!1)^{t}L_{i}^{r_{i}-1}w\sigma)=L_{i}^{r_{i}-1}w\sigma(k\!-\!1)^{t}=L_{i}^{r_{i}}$$

is an expanded Lyndon word.

Finally, we address step (3), where we need to prove that $\mathsf{next}^{t}((k\!-\!1)^{t}L_{i}^{r_{i}-1}w(\sigma\!+\!1))=L_{i}^{r_{i}-1}w(\sigma\!+\!1)x$ , such that $x\in[k]^{t}$ is the lexicographically smallest word for which $L_{i}^{r_{i}-1}w(\sigma\!+\!1)x$ is an expanded Lyndon word. Note that by (1), $(k\!-\!1)^{t}L_{i}^{r_{i}-1}w(\sigma\!+\!1)\neq(k\!-\!1)^{n}$ , so by Lemma 8 such an $x$ exists. Additionally, for any $0\leqslant i<t$ we have that $\mathsf{next}^{i}((k\!-\!1)^{t}L_{i}^{r_{i}-1}{w}(\sigma\!+\!1))=(k\!-\!1)w^{\prime}$ , thus we never fall within the first case of $\mathsf{next}$ .

Next, we show that for any $0\leqslant i<t$ , $j=t-i-1$ , $x=x_{1}\tau x_{2}$ , $x_{1}\in[k]^{i}$ , $\tau\in[k]$ , we have that

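$$\mathsf{next}((k\!-\!1)^{j+1}L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1})=(k\!-\!1)^{j}L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\tau.$$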

We distinguish between two cases depending on $\tau$ . For the first case, let $\tau=k-1$ . We contend that we do not fall within the second case of $\mathsf{next}$ . Assume to the contrary that there is some $\sigma^{\prime}\in[k-1]$ such that $(k\!-\!1)^{j}L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\sigma^{\prime}$ is a necklace head. Thus, $L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\sigma^{\prime}(k\!-\!1)^{j}$ is an expanded Lyndon word. Looking at its suffix of length $t$ , we get

$$x_{1}\sigma^{\prime}(k\!-\!1)^{j}<x_{1}(k\!-\!1)x_{2}=x_{1}\tau x_{2}=x,$$

which is a contradiction to the minimality of $x$ .

Now, for the case where $\tau\in[k-1]$ . By the definition of $x$ we know that $L_{i}^{r_{i}-1}w(\sigma\!+\!1)x=L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\tau x_{2}$ is an expanded Lyndon word. Using Lemma 7, we get that $L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\tau(k\!-\!1)^{j}$ is also an expanded Lyndon word. Therefore, $(k\!-\!1)^{j}L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\tau$ is a necklace head. Left to be shown is that

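$$\tau=\tau_{\min}\triangleq\min\left\{{\tau^{\prime}\in[k]\colon\mathsf{head}((k\!-\!1)^{j}L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\tau^{\prime})}\right\}.$$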

Assuming to the contrary that $\tau_{\min}<\tau$ , then $(k\!-\!1)^{j}L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\tau_{\min}$ is a necklace head, implying that $L_{i}^{r_{i}-1}w(\sigma\!+\!1)x_{1}\tau_{\min}(k\!-\!1)^{j}$ is an expanded Lyndon word. As in the previous case, since

$$x_{1}\tau_{\min}(k\!-\!1)^{j}<x_{1}\tau x_{2}=x,$$

we get a contradiction to the minimality of $x$ . ∎

Lemma 9 does not apply to the penultimate expanded Lyndon word, for which, by simple inspection of the definition of $\mathsf{next}$ we state

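$$\mathsf{next}(L_{Z_{k}(n)-2}^{r_{Z_{k}(n)-2}})=\mathsf{next}((k\!-\!2)(k\!-\!1)^{n-1})=(k\!-\!1)^{n}.\tag{3}$$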

We are now in a position to prove the main result.

Proof of Theorem 4.

As a first technical step it is easy to verify that $\mathsf{next}$ is a shift rule generating some sequence, since indeed for every $\sigma w$ , $\sigma\in[k]$ , $w\in[k]^{n-1}$ , we have $\mathsf{next}(\sigma w)=w\sigma^{\prime}$ for some $\sigma^{\prime}\in[k]$ .

In the next step, let us examine an unknown sequence $\left\langle{\nu_{i}}\right\rangle_{i=0}^{k^{n}-1}$ , that is initialized with $\nu_{0}\dots\nu_{n-1}=0^{n}$ , and whose following letters are generated using $\mathsf{next}$ . We define the numbers $s_{i}\triangleq\sum_{j=0}^{i-1}\left|{L_{j}}\right|$ , for all $0\leqslant i\leqslant Z_{k}(n)-1$ . We prove by induction that for all $i\in[Z_{k}(n)-1]$ , $\nu_{s_{i}}\nu_{s_{i}+1}\dots\nu_{s_{i}+n-1}=L_{i}^{r_{i}}$ . The proof is immediate since the induction base is our initialization of $\nu_{0}\dots\nu_{n-1}=0^{n}=L_{0}^{r_{0}}$ , and the induction step is provided by Lemma 9, since

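$$\nu_{s_{i+1}}\dots\nu_{s_{i+1}+n-1}=\mathsf{next}^{\left|{L_{i}}\right|}(\nu_{s_{i}}\dots\nu_{s_{i}+n-1})=\mathsf{next}^{\left|{L_{i}}\right|}(L_{i}^{r_{i}})=L_{i+1}^{r_{i+1}}.$$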

By this induction, we already have the prefix of the generated sequence to be $L_{0}L_{1}\dots L_{Z_{k}(n)-2}$ , but we are missing the last Lyndon word, $L_{Z_{k}(n)-1}=k-1$ . This is easily taken care of, since by (3),

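$$\begin{aligned}\nu_{s_{Z_{k}(n)-2}+1}\dots\nu_{s_{Z_{k}(n)-2}+n}&=\mathsf{next}(L_{Z_{k}(n)-2}^{r_{Z_{k}(n)-2}})=\mathsf{next}((k\!-\!2)(k\!-\!1)^{n-1})\\&=(k\!-\!1)^{n},\end{aligned}$$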

namely, the last letter is the last Lyndon word,

$$\nu_{s_{Z_{k}(n)-2}+n}=\nu_{s_{Z_{k}(n)-1}}=\nu_{k^{n}-1}=k-1=L_{Z_{k}(n)-1}.$$

We also observe that the shift rule wraps around the end of the sequence. Indeed, by a simple inspection of Definition 3, for every $1\leqslant i\leqslant n$ ,

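$$\begin{aligned}\mathsf{next}(\nu_{k^{n}-i}\dots\nu_{k^{n}-1}\nu_{0}\dots\nu_{n-1-i})&=(k\!-\!1)^{i-1}0^{n-i+1}\\&=\nu_{k^{n}-i+1}\dots\nu_{k^{n}-1}\nu_{0}\dots\nu_{n-i}.\end{aligned}$$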

As the final step in the proof, by FKM [8, 9] this sequence is exactly the prefer-min $(n,k)$ -DB sequence. ∎

We conclude this section by reminding the reader that in order to generate the prefer-max $(n,k)$ -DB sequence (instead of the prefer-min one), all that is required is to start the FSR with $(k\!-\!1)^{n}$ , and to use the shift rule $\psi^{-1}\circ\mathsf{next}\circ\psi$ , where $\psi$ is the complement function defined in Section 2, and $\circ$ denotes function composition.
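The conjugation $\psi^{-1}\circ\mathsf{next}\circ\psi$ can be illustrated without committing to any particular implementation of $\mathsf{next}$. In the sketch below, the prefer-min rule for $n=2$, $k=3$ is replaced by a lookup table built from the sequence $001021122$ of Example 1; this table is only a stand-in chosen for the illustration, and all names are hypothetical. Since $\psi$ is an involution, $\psi^{-1}=\psi$.

```python
def complement(word, k):
    """psi applied letter by letter: a -> k - 1 - a."""
    return tuple(k - 1 - a for a in word)


def prefer_max_rule(next_min, k):
    """Wrap a prefer-min shift rule into psi^{-1} o next o psi (psi is its own inverse)."""
    return lambda state: complement(next_min(complement(state, k)), k)


# Stand-in for `next`: a table read off the prefer-min (2,3)-DB sequence of Example 1.
N, K = 2, 3
SEQ = (0, 0, 1, 0, 2, 1, 1, 2, 2)
TABLE = {
    tuple(SEQ[(i + j) % len(SEQ)] for j in range(N)):
    tuple(SEQ[(i + 1 + j) % len(SEQ)] for j in range(N))
    for i in range(len(SEQ))
}


def run(rule, start, steps):
    """Iterate a state-to-state shift rule, emitting the first letter of each state."""
    out, state = [], start
    for _ in range(steps):
        out.append(state[0])
        state = rule(state)
    return out


rule_max = prefer_max_rule(TABLE.__getitem__, K)
print(run(rule_max, (K - 1,) * N, K ** N))  # [2, 2, 1, 2, 0, 1, 1, 0, 0]
```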

4 Efficient Shift-Rule Algorithm

Algorithms for implementing shift-rules for the prefer-min (or prefer-max) $(n,k)$ -DB sequences are known [10, 6]. These greedy algorithms require $\Theta(k^{n})$ memory, and $\Theta(nk^{n})$ time in the worst case (since they in fact need to generate the sequence until the position of the desired next letter). The main result of this section is an efficient algorithm, requiring $O(n)$ time and memory, that implements the shift rule we presented in the previous section. By quick inspection, the claim hinges on an efficient implementation of the $\mathsf{head}$ predicate, as well as finding $\min S$ in the second case of $\mathsf{next}$ .

Our algorithm uses two key components. The first, is the renowned factorization due to Chen, Fox, and Lyndon [2], namely, that every word $w\in[k]^{+}$ has a unique decomposition $w=w_{0}w_{1}\dots w_{m-1}$ , such that $w_{i}$ is a Lyndon word for all $0\leqslant i\leqslant m-1$ , and $w_{0}\geqslant w_{1}\geqslant\dots\geqslant w_{m-1}$ . We shall call this the CFL factorization of $w$ . The second key component is due to Duval [4], who showed that this unique decomposition may be computed for all $w\in[k]^{+}$ in $O(\left|{w}\right|)$ time and memory.
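For concreteness, here is a minimal sketch of the standard formulation of Duval's algorithm for the CFL factorization. It is the textbook version rather than code from the paper, and the variable names are chosen here.

```python
def duval(s):
    """CFL factorization of a tuple s into non-increasing Lyndon words, in linear time."""
    factors, i, n = [], 0, len(s)
    while i < n:
        j, p = i + 1, i
        # Extend the current run while it stays a prefix of a power of a Lyndon word.
        while j < n and s[p] <= s[j]:
            p = i if s[p] < s[j] else p + 1
            j += 1
        # The run is a repetition of a Lyndon word of length j - p; emit its full copies.
        while i <= p:
            factors.append(s[i:i + j - p])
            i += j - p
    return factors


print(duval((1, 0, 2, 0, 0)))  # [(1,), (0, 2), (0,), (0,)]
```

In the printed factorization, $1\geqslant 02\geqslant 0\geqslant 0$ under the order of Section 2, where a prefix is smaller than any of its extensions.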

First, we address the efficiency of computing the predicate $\mathsf{head}$ .

Lemma 10.

For any $w\in[k]^{n}$ it is possible to compute $\mathsf{head}(w)$ in $O(n)$ time and memory.

Proof.

Let $i\in{\mathbb{Z}}$ be the largest integer such that $(k\!-\!1)^{i}$ is a prefix of $w$ . We apply Duval’s algorithm [4] to $\mathcal{R}^{i}(w)$ to obtain its CFL factorization $\mathcal{R}^{i}(w)=w_{0}w_{1}\dots w_{m-1}$ . Then $\mathsf{head}(w)$ is true if and only if $w_{0}=w_{m-1}$ . ∎
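The test in this proof can be sketched as follows. The function name `head_linear` is an assumption of this sketch. One detail is added explicitly here: the code first checks that the last letter differs from $k-1$, as Definition 2 requires $\sigma\in[k-1]$; this check is taken from the definition and is an assumption about how the proof is meant to be read.

```python
def duval(s):
    """Standard Duval algorithm: CFL factorization of a tuple in linear time."""
    factors, i, n = [], 0, len(s)
    while i < n:
        j, p = i + 1, i
        while j < n and s[p] <= s[j]:
            p = i if s[p] < s[j] else p + 1
            j += 1
        while i <= p:
            factors.append(s[i:i + j - p])
            i += j - p
    return factors


def head_linear(w, k):
    """Necklace-head test in O(n): rotate away the leading (k-1)'s, then factorize."""
    if w[-1] == k - 1:               # Definition 2 requires the last letter != k-1
        return False
    i = 0
    while w[i] == k - 1:             # largest i with (k-1)^i a prefix of w
        i += 1
    factors = duval(w[i:] + w[:i])
    # Factors are non-increasing, so first == last means all factors are equal,
    # i.e. the rotated word is a power of a single Lyndon word.
    return factors[0] == factors[-1]


print(head_linear((2, 0), 3), head_linear((1, 0), 3))  # True False
```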

Next we recall some useful results already known in the literature. A word $w\in\Sigma^{*}$ is called a pre-necklace if there exists $w^{\prime}\in\Sigma^{*}$ such that $ww^{\prime}$ is an expanded Lyndon word. By [1, Lemma 2.3], a pre-necklace must necessarily be a fractional power of a Lyndon word, i.e., $w=u^{m}v$ , with $u$ being a Lyndon word, $m\geqslant 1$ , and $v$ a proper prefix of $u$ . Since the $u^{m}$ part is a prefix of a CFL decomposition for $w$ , this decomposition is unique and it is efficiently computable in $O(\left|{w}\right|)$ time and memory. Finally, we recall [1, Theorem 2.1], which its authors dubbed the “fundamental theorem of necklaces”.

Theorem 11 (Theorem 2.1 of [1]).

Let $w=\tau_{0}\tau_{1}\dots\tau_{n-1}$ , with $\tau_{i}\in[k]$ , be a pre-necklace with fractional-power decomposition $w=u^{m}v$ . Then, $w\sigma$ , $\sigma\in[k]$ , is a pre-necklace if and only if $\sigma\geqslant\tau_{\left|{v}\right|}$ . Furthermore, $w\sigma$ is a Lyndon word if and only if $\sigma>\tau_{\left|{v}\right|}$ .
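Theorem 11 can be exercised with a short scan that tracks $\left|{u}\right|$ for a pre-necklace $w=u^{m}v$. The sketch below is an illustration under the assumption that the input `w` is a non-empty tuple; the names `lyndon_prefix_length` and `classify_extension` are hypothetical. It relies on the fact that the letter at position $\left|{w}\right|-\left|{u}\right|$ of $w$ equals $\tau_{\left|{v}\right|}$.

```python
def lyndon_prefix_length(w):
    """Return |u| for a pre-necklace w = u^m v, or None if w is not a pre-necklace."""
    p = 1
    for j in range(1, len(w)):
        if w[j] < w[j - p]:
            return None              # no extension of w[:j+1] can be a necklace
        if w[j] > w[j - p]:
            p = j + 1                # w[:j+1] is itself a Lyndon word
    return p


def classify_extension(w, sigma):
    """Classify w + sigma according to Theorem 11, for a pre-necklace w."""
    p = lyndon_prefix_length(w)
    pivot = w[len(w) - p]            # this is tau_{|v|}
    if sigma < pivot:
        return "neither"
    return "Lyndon" if sigma > pivot else "pre-necklace"


w = (0, 1, 0)                        # u = 01, m = 1, v = 0, so tau_{|v|} = 1
print([classify_extension(w, s) for s in range(3)])
# ['neither', 'pre-necklace', 'Lyndon']
```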

We are now in a position to describe the algorithm for $\mathsf{next}(\sigma w)$ and prove its correctness.

Theorem 12.

Algorithm 1 correctly computes the shift rule $\mathsf{next}$ from Definition 3 in $O(n)$ time and memory.

Proof.

We argue that Algorithm 1 computes the function $\mathsf{next}$ . We consider the three cases of Definition 3 separately. First, if $\sigma\in[k-1]$ and $\mathsf{head}(w\sigma)$ , the algorithm returns $w(\sigma\!+\!1)$ in line 2 as required by the first case of Definition 3.

Now, assume the input $\sigma w$ falls within the third case of Definition 3. If $\sigma<k-1$ the claim is obvious as the condition in line 1 does not hold. If $\sigma=k-1$ , then we have $S\triangleq\left\{{\sigma^{\prime\prime}\in[k-1]\colon\mathsf{head}(w\sigma^{\prime\prime})}\right\}$ . By Lemma 7, $S\neq\emptyset$ if and only if $\mathsf{head}(w(k\!-\!2))$ holds. Thus, line 5 correctly checks whether the second case of Definition 3 applies. We therefore reach line 15 exactly when the third case of Definition 3 applies, and correctly return $w\sigma$ .

We are left with the second case of Definition 3, where $\sigma=k-1$ and $S\neq\emptyset$ . First, the special case of $\sigma w=(k\!-\!1)^{n}$ , is handled correctly in line 4. Otherwise, $w$ contains some letter other than $k-1$ , and $w^{\prime}$ is well defined.

We now contend that $\min S\in\left\{{\sigma^{\prime},\sigma^{\prime}+1}\right\}$ . Since $\mathsf{head}(w(k\!-\!2))$ holds, then $w^{\prime}(k\!-\!2)(k\!-\!1)^{t}$ is an expanded Lyndon word, hence $w^{\prime}$ is a pre-necklace. Also, note that if $\sigma^{\prime\prime}\in S$ then $w^{\prime}\sigma^{\prime\prime}$ is a pre-necklace. By Theorem 11, if $\sigma^{\prime\prime}<\sigma^{\prime}$ then $w^{\prime}\sigma^{\prime\prime}$ is not a pre-necklace. Hence, $\min S\geqslant\sigma^{\prime}$ . However, also by Theorem 11, $w^{\prime}(\sigma^{\prime}\!+\!1)$ is a Lyndon word, thus $\sigma^{\prime}+1\in S$ and $\min S\leqslant\sigma^{\prime}+1$ . This leaves only two possible values for $\min S$ , and consequently, the algorithm terminates in line 10 or in line 12, and returns the desired word.

Finally, as already noted, CFL factorization, $\mathsf{head}$ , as well as the fractional-power decomposition of line 7, may be computed in linear time and memory (all relying on the CFL factorization algorithm). Thus, the entire algorithm takes linear time and memory. ∎
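The argument above suggests the following linear-time sketch. It is a reconstruction from Definition 3 and the proof of Theorem 12, not a transcription of Algorithm 1, so the line numbers cited in the proof do not correspond to lines of this code. It assumes, following the proof, that $w=(k\!-\!1)^{t}w^{\prime}$ with $t$ maximal and that $\sigma^{\prime}=\tau_{\left|{v}\right|}$ for the fractional-power decomposition $w^{\prime}=u^{m}v$; all function names are hypothetical.

```python
def duval(s):
    """Standard Duval algorithm: CFL factorization of a tuple in linear time."""
    factors, i, n = [], 0, len(s)
    while i < n:
        j, p = i + 1, i
        while j < n and s[p] <= s[j]:
            p = i if s[p] < s[j] else p + 1
            j += 1
        while i <= p:
            factors.append(s[i:i + j - p])
            i += j - p
    return factors


def head(v, k):
    """Linear-time necklace-head test (Definition 2, via CFL factorization)."""
    if v[-1] == k - 1:
        return False
    t = 0
    while v[t] == k - 1:
        t += 1
    factors = duval(v[t:] + v[:t])
    return factors[0] == factors[-1]


def lyndon_prefix_length(w):
    """|u| for a word w = u^m v that is already known to be a pre-necklace."""
    p = 1
    for j in range(1, len(w)):
        if w[j] > w[j - p]:
            p = j + 1
    return p


def next_fast(state, k):
    """Sketch of the shift rule `next` using a constant number of O(n) scans."""
    sigma, w = state[0], state[1:]
    if sigma != k - 1:                       # first or third case
        return w + (sigma + 1,) if head(w + (sigma,), k) else w + (sigma,)
    if all(a == k - 1 for a in w):           # state is (k-1)^n, where min S = 0
        return w + (0,)
    if not head(w + (k - 2,), k):            # S is empty (by Lemma 7): third case
        return w + (sigma,)
    t = 0
    while w[t] == k - 1:                     # w = (k-1)^t w'
        t += 1
    w_prime = w[t:]
    pivot = w_prime[len(w_prime) - lyndon_prefix_length(w_prime)]   # sigma'
    # min S is either sigma' or sigma' + 1.
    return w + (pivot,) if head(w + (pivot,), k) else w + (pivot + 1,)


print(next_fast((2, 1), 3))  # (1, 1)
```

Each call performs a fixed number of linear scans and at most three $\mathsf{head}$ tests, which is consistent with the $O(n)$ bound claimed in Theorem 12.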

5 Discussion

In this paper we studied the well-known prefer-min and prefer-max $(n,k)$ -DB sequences. We closed a gap in the literature by presenting a shift rule for these sequences, as well as an efficient algorithm computing this shift rule. The algorithm receives as input a subword of $n$ consecutive letters, and determines the next letter in $O(n)$ time and memory.

The shift rule we presented may be seen as an extension of the binary shift rule presented in [7]. Indeed, if we set $k=2$ in our algorithm, the second case of Definition 3 becomes degenerate, and we are left with the algorithm of [7]. This also explains the main difficulty in our solution, which is finding $\min S$ efficiently. The crux of solving this difficulty is the proof that we only need to choose between two carefully chosen values.

Bibliography

[1] K. Cattell, F. Ruskey, J. Sawada, M. Serra, Fast algorithms to generate necklaces, unlabeled necklaces, and irreducible polynomials over $GF(2)$, J. of Algorithms 37 (2000) 267–282.
[2] K. T. Chen, R. H. Fox, R. C. Lyndon, Free differential calculus, IV, Annals of Math. 68 (1958) 81–95.
[3] P. B. Dragon, O. I. Hernandez, J. Sawada, A. Williams, D. Wong, Constructing de Bruijn sequences with co-lexicographic order: The $k$-ary Grandmama sequence, European J. of Combin. 72 (2018) 1–11.
[4] J. P. Duval, Factorizing words over an ordered alphabet, J. of Algorithms 4 (1983) 363–381.
[5] T. Etzion, An algorithm for constructing $m$-ary de Bruijn sequences, J. of Algorithms 7 (3) (1986) 331–340.
[6] L. R. Ford, A cyclic arrangement of $m$-tuples, Tech. Rep. P-1071, RAND Corp. (1957).
[7] H. M. Fredricksen, Generation of the Ford sequence of length $2^{n}$, $n$ large, J. Combin. Theory 12 (1972) 153–154.
[8] H. M. Fredricksen, I. J. Kessler, Lexicographic compositions and de Bruijn sequences, J. Combin. Theory 22 (1977) 17–30.
