Universal parameterized family of distributions of runs

Hayato Takahashi

arXiv:2302.14356·math.PR·December 18, 2025


TL;DR

This paper derives explicit formulas for probabilities related to runs and nonoverlapping words in i.i.d. finite-valued sequences, generalizing previous results and analyzing computational complexity.

Contribution

It introduces a unified explicit formula for run probabilities in i.i.d. sequences, extending $\mu$-overlapping probabilities, and analyzes the number of arithmetic operations needed to evaluate it.

Findings

1. Explicit formulas for run probabilities in i.i.d. sequences

2. Linear computational complexity for fixed parameters

3. Asymptotic analysis of integer partitions

Abstract

We present explicit formulae for parameterized families of probabilities of the number of nonoverlapping words and increasing nonoverlapping words in independent and identically distributed (i.i.d.) finite valued random variables, respectively. Then we provide an explicit formula for a parameterized family of probabilities of the number of runs, which generalizes $\mu$-overlapping probabilities for $\mu\geq 0$ in i.i.d. binary valued random variables. We also give exact probabilities of the number of runs whose sizes are exactly given numbers (Mood 1940). The number of arithmetic operations required to compute our formula for generalized probabilities of runs is of linear order in the sample size for a fixed number of parameters and a fixed range. To analyse this number of arithmetic operations for an unbounded number of parameters, we show an asymptotic formula for the number of integer…

Tables

Table 1: Distance of distributions

$d$	1	3	5	7	9
$\operatorname{dist}(d,995|40)$	0.117859	0.0168652	0.0036909	0.0009005	0.0002248

Equations

$$\mathbf{N}(w_{1},\ldots,w_{h};x_{1}^{n}) := \left(\sum_{i=1}^{n-|w_{1}|+1} I_{w_{1}}(x_{i}^{n}),\ \ldots,\ \sum_{i=1}^{n-|w_{h}|+1} I_{w_{h}}(x_{i}^{n})\right),$$

$$\binom{n}{a_{1},\ldots,a_{l}} = \frac{n!}{a_{1}!\cdots a_{l}!\,(n-\sum a_{i})!},$$

$$A(k_{1},\ldots,k_{h}) = \binom{n-\sum_{i}|w_{i}|k_{i}+\sum_{i}k_{i}}{k_{1},\ldots,k_{h}} \prod_{i=1}^{h} P^{k_{i}}(w_{i}),$$

$$B(k_{1},\ldots,k_{h}) = P(\mathbf{N}(w_{1},\ldots,w_{h};X_{1}^{n}) = (k_{1},\ldots,k_{h})),$$

$$F_{A}(z_{1},\ldots,z_{h}) = \sum_{k_{1},\ldots,k_{h}} A(k_{1},\ldots,k_{h})\, z_{1}^{k_{1}}\cdots z_{h}^{k_{h}}, \text{ and}$$

$$F_{B}(z_{1},\ldots,z_{h}) = \sum_{k_{1},\ldots,k_{h}} B(k_{1},\ldots,k_{h})\, z_{1}^{k_{1}}\cdots z_{h}^{k_{h}}.$$

$$A(k_{1},\ldots,k_{h}) = \sum_{k_{1}\leq t_{1},\ldots,k_{h}\leq t_{h}} B(t_{1},\ldots,t_{h}) \binom{t_{1}}{k_{1}}\cdots\binom{t_{h}}{k_{h}},$$

$$F_{A}(z_{1},z_{2},\ldots,z_{h}) = F_{B}(z_{1}+1,z_{2}+1,\ldots,z_{h}+1), \text{ and}$$

$$P(\mathbf{N}(w_{1},\ldots,w_{h};X_{1}^{n}) = (s_{1},\ldots,s_{h})) = \sum_{\substack{k_{1},\ldots,k_{h}:\ s_{1}\leq k_{1},\ldots,s_{h}\leq k_{h} \\ \sum_{i}|w_{i}|k_{i}\leq n}} (-1)^{\sum_{i}k_{i}-s_{i}} \binom{n-\sum_{i}|w_{i}|k_{i}+\sum_{i}k_{i}}{s_{1},\ldots,s_{h},k_{1}-s_{1},\ldots,k_{h}-s_{h}} \prod_{i=1}^{h} P^{k_{i}}(w_{i}).$$

$$\binom{n-|w_{1}|k_{1}-|w_{2}|k_{2}+k_{1}+k_{2}}{k_{1},k_{2}}.$$

$$A(k_{1},k_{2}) := \binom{n-|w_{1}|k_{1}-|w_{2}|k_{2}+k_{1}+k_{2}}{k_{1},k_{2}} P^{k_{1}}(w_{1}) P^{k_{2}}(w_{2}).$$

$$A(k_{1},k_{2}) = \sum_{k_{1}\leq t_{1},\,k_{2}\leq t_{2}} B(t_{1},t_{2}) \binom{t_{1}}{k_{1}}\binom{t_{2}}{k_{2}}.$$

$$\begin{aligned}
F_{A}(z_{1},z_{2}) &= \sum_{t_{1},t_{2}} B(t_{1},t_{2}) \sum_{k_{1}\leq t_{1},\,k_{2}\leq t_{2}} \binom{t_{1}}{k_{1}}\binom{t_{2}}{k_{2}} z_{1}^{k_{1}} z_{2}^{k_{2}} \\
&= \sum_{t_{1},t_{2}} B(t_{1},t_{2}) (z_{1}+1)^{t_{1}} (z_{2}+1)^{t_{2}} \\
&= F_{B}(z_{1}+1,z_{2}+1).
\end{aligned}$$

$$\begin{aligned}
F_{B}(z_{1},z_{2}) &= F_{A}(z_{1}-1,z_{2}-1) \\
&= \sum_{k_{1},k_{2}:\ |w_{1}|k_{1}+|w_{2}|k_{2}\leq n} \binom{n-|w_{1}|k_{1}-|w_{2}|k_{2}+k_{1}+k_{2}}{k_{1},k_{2}} (z_{1}-1)^{k_{1}} (z_{2}-1)^{k_{2}} P^{k_{1}}(w_{1}) P^{k_{2}}(w_{2}) \\
&= \sum_{\substack{k_{1},k_{2},t_{1},t_{2}:\ |w_{1}|k_{1}+|w_{2}|k_{2}\leq n \\ t_{1}\leq k_{1},\,t_{2}\leq k_{2}}} \binom{n-|w_{1}|k_{1}-|w_{2}|k_{2}+k_{1}+k_{2}}{k_{1},k_{2}} \binom{k_{1}}{t_{1}}\binom{k_{2}}{t_{2}} z_{1}^{t_{1}} z_{2}^{t_{2}} (-1)^{k_{1}+k_{2}-t_{1}-t_{2}} P^{k_{1}}(w_{1}) P^{k_{2}}(w_{2}) \\
&= \sum_{t_{1},t_{2}} z_{1}^{t_{1}} z_{2}^{t_{2}} \sum_{\substack{k_{1},k_{2}:\ |w_{1}|k_{1}+|w_{2}|k_{2}\leq n \\ t_{1}\leq k_{1},\,t_{2}\leq k_{2}}} (-1)^{k_{1}+k_{2}-t_{1}-t_{2}} \binom{n-|w_{1}|k_{1}-|w_{2}|k_{2}+k_{1}+k_{2}}{t_{1},t_{2},k_{1}-t_{1},k_{2}-t_{2}} P^{k_{1}}(w_{1}) P^{k_{2}}(w_{2}).
\end{aligned}$$

$$E(\mathbf{N}^{t}(w;X^{n})) = \sum_{s=1}^{\min\{\lfloor n/|w| \rfloor,\,t\}} A_{t,s} \binom{n-s|w|+s}{s} P^{s}(w)$$

$$E(Y_{i}Y_{j}) = \begin{cases} P(w) & \text{if } i=j, \\ P^{2}(w) & \text{if } Y_{i} \text{ and } Y_{j} \text{ are disjoint}, \\ 0 & \text{else.} \end{cases}$$

$$E(\mathbf{N}^{t}(w;X^{n})) = E\left(\sum_{n(1),\ldots,n(t)} \prod_{j=1}^{t} Y_{j,n(j)}\right).$$

$$\mathbf{N}^{\prime}(w_{1},\ldots,w_{h};x) := (s_{1}-s_{2},s_{2}-s_{3},\ldots,s_{h}) \text{ if } \mathbf{N}(w_{1},\ldots,w_{h};x) = (s_{1},\ldots,s_{h}).$$

$$C_{n,(w_{1},\ldots,w_{h})}(x) := t \text{ if } \sum_{i} ik_{i} = t \text{ and } \mathbf{N}^{\prime}(w_{1},\ldots,w_{h};x_{1}^{n}) = (k_{1},k_{2},\ldots,k_{h}) \text{ for } |x|=n,$$

$$A(k_{1},\ldots,k_{h}) := \binom{n-\sum_{i}|w_{i}|k_{i}+\sum_{i}k_{i}}{k_{1},\ldots,k_{h}} \prod_{i=1}^{h} P^{k_{i}}(w_{i}),$$

$$B(k_{1},\ldots,k_{h}) := P(\mathbf{N}^{\prime}(w_{1},\ldots,w_{h};X_{1}^{n}) = (k_{1},k_{2},\ldots,k_{h})),$$

$$F_{A}(z_{1},\ldots,z_{h}) := \sum_{k_{1},\ldots,k_{h}:\ \sum_{i}|w_{i}|k_{i}\leq n} A(k_{1},\ldots,k_{h})\, z_{1}^{k_{1}}\cdots z_{h}^{k_{h}}, \text{ and}$$

$$F_{B}(z_{1},\ldots,z_{h}) := \sum_{k_{1},\ldots,k_{h}:\ \sum_{i}|w_{i}|k_{i}\leq n} B(k_{1},\ldots,k_{h})\, z_{1}^{k_{1}}\cdots z_{h}^{k_{h}}.$$

$$F_{A}(z_{1},\ldots,z_{h}) = F_{B}\left(z_{1}+1,\ z_{1}+z_{2}+1,\ \ldots,\ \sum_{i} z_{i}+1\right) \text{ and}$$

$$P(C_{n,(w_{1},\ldots,w_{h})}(X_{1}^{n}) = t) = \sum_{\substack{r,k_{1},\ldots,k_{h}:\ \sum|w_{i}|k_{i}\leq n,\ r\leq\sum k_{i} \\ t=\sum_{i} ik_{i}-r}} (-1)^{r} \binom{n-\sum|w_{i}|k_{i}+\sum k_{i}}{k_{1},\ldots,k_{h}} \binom{\sum k_{i}}{r} \prod P^{k_{i}}(w_{i}).$$

$$F_{A}(X,X(X+1),\ldots,X(X+1)^{h-1}) = F_{B}(X+1,(X+1)^{2},\ldots,(X+1)^{h}).$$



Full text

Universal parameterized family of distributions of runs

Parts of the paper have been presented at MSJ2017, MSJ2023, and ICIAM 2023; see Takahashi (2023a, b).

Hayato Takahashi (Random Data Lab. Inc., Tokyo 1210062, Email: [email protected])

Abstract

We present explicit formulae for parameterized families of distributions of the number of nonoverlapping words and increasing nonoverlapping words in independent and identically distributed (i.i.d.) finite valued random variables, respectively. Then we provide an explicit formula for a parameterized family of distributions of the number of runs, which generalizes $\mu$-overlapping distributions for $\mu\geq 0$ in i.i.d. binary valued random variables. We also give the distribution of the number of runs whose sizes are exactly given numbers (Mood 1940). The number of arithmetic operations required to compute our formula for generalized distributions of runs, for a fixed number of parameters and a fixed range, is of linear order in the sample size.

**Keywords:** exact distribution, scan, run, pattern, inclusion-exclusion principles

**Mathematics Subject Classification:** 05A15, 62E15

1 Introduction

We study distributions of the number of words in finite valued i.i.d. random variables (distributions of words for short). The distributions of words play an important role in statistics, DNA analysis, and information theory; see Balakrishnan & Koutras (2002); Berthé & Rigo (2016); Feller (1970); Jacquet & Szpankowski (2015); Lothaire (2005); Mood (1940); Robin et al. (2005); Wald & Wolfowitz (1940); Waterman (1995), and Zehavi & Wolf (1988).

Generating functions of the distributions of words obtained by inductive relations of words on sample size are inevitably rational functions; see Bassino et al. (2010); Blom & Thorburn (1982); Chrysaphinou & Papastavridis (1988); Flajolet & Sedgewick (2009); Goulden & Jackson (1983); Guibas & Odlyzko (1981), and Régnier & Szpankowski (1998). Feller (1970), Jacquet & Szpankowski (2015), and Robin et al. (2005) obtain approximations and recurrence formulae for the distributions of words from rational generating functions. Uppuluri & Patil (1983) and Antzoulakos & Chadjiconstantindis (2001) obtain explicit formulae by expanding rational generating functions into power series. However, in general, expanding rational functions into power series is not immediate; cf. Feller (1970), Chapter 11, Section 4, p. 275.

A word that consists of the same letter is called a run. The number of runs depends on the counting manner. Let $0^{m}$ be the word that consists of $m$ zeros. For $x\in\{0,1\}^{n}$ , let

(i) $E_{n,m}(x)$ , the number of $0^{m}$ of size exactly $m$ in $x$ (Mood, 1940; Fu & Koutras, 1994),

(ii) $G_{n,m}(x)$ , the number of $0^{m}$ of size greater than or equal to $m$ in $x$ (Fu & Koutras, 1994; Antzoulakos & Chadjiconstantindis, 2001),

(iii) $N_{n,m}(x)$ , the number of nonoverlapping $0^{m}$ in $x$ (Godbole, 1990; Hirano, 1986; Muselli, 1996; Antzoulakos & Chadjiconstantindis, 2001; Fu & Koutras, 1994; Feller, 1970),

(iv) $M_{n,m}(x)$ , the number of overlapping $0^{m}$ in $x$ (Ling, 1988; Antzoulakos & Chadjiconstantindis, 2001; Koutras & Alexandrou, 1997; Fu & Koutras, 1994; Godbole, 1992),

(v) $L_{n}(x)$ , the size of the longest run of 0s in $x$ (Makri et al., 2007; Phillipou & Makri, 1986; Antzoulakos & Chadjiconstantindis, 2001; Fu & Koutras, 1994),

(vi) $T_{k}(x)$ , the stopping time $t$ such that $0^{k}$ first appears in $x=x_{1}\cdots x_{t}$ (Aki et al., 1984; Philippou et al., 1983; Uppuluri & Patil, 1983), and

(vii) $N_{n,m,\mu}(x)$ , the enumeration of $0^{m}$ such that we allow $\mu$ -letters overlapping with the previous $0^{m}$ in the string $x$ (Aki & Hirano, 2000; Han & Aki, 2000; Makri & Psillakis, 2015).

Fu & Koutras (1994) provide nonparametric exact distributions of runs by the Markov imbedding method. Though obtaining parametric models for distributions of words is desirable (Stefanov & Pakes, 1997), as far as the author understands, no explicit formulae for parameterized families of distributions of nonoverlapping words, nonoverlapping increasing words, $E_{n,m}$ , and $N_{n,m,\mu}$ are known.

In this paper, we present explicit formulae for distributions of these statistics. To avoid the difficulty of enumerating overlapping words and expanding rational functions into power series, in Theorem 3.2, we study distributions of increasing nonoverlapping words and their finite dimensional generating functions. Combining Theorem 3.2 with a combinatorial lemma, in Theorem 3.6, we derive explicit formulae for parameterized distributions of runs, including those of the statistics (i)–(vii) above, in a unified manner for binary valued i.i.d. random variables. Generalization of our formulae in Theorem 3.6 to countable valued i.i.d. random variables is straightforward, see Remark 3.7.

The rest of the paper is organized as follows. In Section 2 (Theorems 2.1 and 2.2), we show explicit formulae for parameterized families of joint probabilities of nonoverlapping words and their moments for finite valued i.i.d. random variables. In Section 3, we derive explicit formulae for distributions of runs. In Section 4, we study the distance among the distributions of runs. In Section 5, we present an algorithm for computing our formulae and analyse its complexity.

2 Joint distributions of nonoverlapping words

A finite string over a finite alphabet ${\mathcal{A}}$ is called a word. Let $|x|$ be the length of a word $x$ . The word $xy$ is the concatenation of two words $x$ and $y$ . The word $x^{k}$ is the $k$-fold concatenation of a word $x$ , e.g. $x^{2}=xx$ . A word $x$ is called overlapping if there is a word $z$ such that $x$ appears at least 2 times in $z$ and $|z|<2|x|$ ; otherwise $x$ is called nonoverlapping. A pair of words $(x,y)$ with $x\neq y$ is called overlapping if there is a word $z$ such that $x$ and $y$ appear in $z$ and $|z|<|x|+|y|$ ; otherwise the pair is called nonoverlapping. A finite set of words $S$ is called nonoverlapping if every $x\in S$ and every pair $(x,y)\in S^{2},x\neq y$ are nonoverlapping; otherwise $S$ is called overlapping. For example, $\{10\}\text{ and }\{00111,00101\}$ are nonoverlapping; $\{00\}$ and $\{10,01\}$ are overlapping.
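The definitions above can be checked mechanically. The following is a minimal sketch, assuming the equivalent characterization that two occurrences fit into a word shorter than their total length exactly when one word contains the other or a proper suffix of one equals a prefix of the other. The function names `has_overlap` and `is_nonoverlapping` are hypothetical and do not come from the paper.

```python
def has_overlap(x, y):
    """Return True if x and y can both occur in a word z with len(z) < len(x) + len(y).

    For x == y this asks for two distinct occurrences of x, i.e. a proper
    prefix of x that is also a suffix of x.
    """
    # One word strictly inside the other already gives a short common superword.
    if x != y and (x in y or y in x):
        return True
    # Otherwise look for a nonempty proper suffix of one word equal to a prefix of the other.
    for k in range(1, min(len(x), len(y))):
        if x[-k:] == y[:k] or y[-k:] == x[:k]:
            return True
    return False


def is_nonoverlapping(words):
    """Return True if every word and every pair of distinct words is nonoverlapping."""
    words = list(words)
    return not any(
        has_overlap(x, y) for i, x in enumerate(words) for y in words[i:]
    )


# The four examples from the text.
print(is_nonoverlapping(["10"]))              # True
print(is_nonoverlapping(["00111", "00101"]))  # True
print(is_nonoverlapping(["00"]))              # False
print(is_nonoverlapping(["10", "01"]))        # False
```

The quadratic pair scan is adequate for the small word sets used in this paper.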

In the following, let ${\mathbf{N}}(w_{1},\ldots,w_{h};x_{1}^{n})$ be the number of words $w_{1},\ldots,w_{h}$ in an arbitrary position of $x_{1}^{n}\in{\mathcal{A}}^{n}$ , i.e.

[TABLE]

where $x_{i}^{n}=x_{i}\cdots x_{n}$ and $I_{w_{j}}(x_{i}^{n})=1$ if $x_{i}\cdots x_{i+|w_{j}|-1}=w_{j}$ else 0 for all $i,j$ . For $a_{1}+\cdots+a_{l}\leq n$ , let

[TABLE]

where $0!=1$ . Let $P$ be a probability on ${\mathcal{A}}$ , i.e., $0\leq P(a)\leq 1$ for $a\in{\mathcal{A}}$ and $\sum_{a\in{\mathcal{A}}}P(a)=1$ . Set $P(w)=\prod P(a_{i})$ for $w=a_{1}\cdots a_{|w|},\ a_{i}\in{\mathcal{A}}$ . For example $P(w)=2^{-|w|}$ for all $w$ if $P(0)=P(1)=1/2$ for ${\mathcal{A}}=\{0,1\}$ .

Theorem 2.1

Let ${\mathcal{A}}$ be a finite alphabet and $P$ a probability on ${\mathcal{A}}$ . Let $X_{1}^{n}:=X_{1}X_{2}\cdots X_{n}$ be ${\mathcal{A}}$ -valued i.i.d. random variables from $P(X_{i}=a)=P(a)$ for $a\in{\mathcal{A}}$ . Let $w_{1},\ldots,w_{h}$ be nonoverlapping. Let

[TABLE]

Then

[TABLE]

Proof) We prove the theorem for $h=2$ . The proof for the general case is similar. The number of possible allocations such that $w_{1}$ and $w_{2}$ appear $k_{1}$ and $k_{2}$ times respectively without overlapping in the strings of length $n$ is

[TABLE]

This is because if we replace each occurrence of $w_{1}$ and $w_{2}$ with extra symbols $\alpha$ and $\beta$ , respectively, then the problem reduces to choosing the positions of $k_{1}$ $\alpha$ s and $k_{2}$ $\beta$ s in a string of length $n-|w_{1}|k_{1}-|w_{2}|k_{2}+k_{1}+k_{2}$ . Let

[TABLE]

The function $A$ is not the probability that $w_{1}$ and $w_{2}$ occur exactly $k_{1}$ and $k_{2}$ times in the string, since we allow any letters in the remaining places, so further occurrences of $w_{1}$ and $w_{2}$ may appear there. Let $B(t_{1},t_{2})$ be the probability that $w_{1}$ and $w_{2}$ appear exactly $t_{1}$ and $t_{2}$ times, respectively. We have the following identity,

[TABLE]

Then

[TABLE]

We have

[TABLE]

and (1) . ∎
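As a sanity check of Theorem 2.1, the sketch below evaluates its single-word case, $P({\mathbf{N}}(w;X_{1}^{n})=s)=\sum_{k\geq s,\ |w|k\leq n}(-1)^{k-s}\binom{n-|w|k+k}{s,\,k-s}P^{k}(w)$, and compares it with brute-force enumeration over all strings of length $n$. The helper names (`multinomial`, `word_prob`, `count_formula`, `count_brute`) and the chosen letter probabilities are illustrative assumptions; exact rational arithmetic is used so that the comparison is exact.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def multinomial(n, parts):
    """The paper's coefficient n! / (a_1! ... a_l! (n - sum a_i)!), or 0 if undefined."""
    rest = n - sum(parts)
    if rest < 0 or any(a < 0 for a in parts):
        return 0
    denom = factorial(rest)
    for a in parts:
        denom *= factorial(a)
    return factorial(n) // denom


def word_prob(w, p):
    """P(w) as the product of the letter probabilities in the dict p."""
    result = Fraction(1)
    for a in w:
        result *= p[a]
    return result


def count_formula(w, n, s, p):
    """P(N(w; X_1^n) = s) from the inclusion-exclusion sum (h = 1)."""
    m, pw = len(w), word_prob(w, p)
    total = Fraction(0)
    for k in range(s, n // m + 1):
        total += (-1) ** (k - s) * multinomial(n - m * k + k, [s, k - s]) * pw ** k
    return total


def count_brute(w, n, s, p):
    """The same probability by summing P(x) over all strings x with exactly s occurrences."""
    total = Fraction(0)
    for letters in product(sorted(p), repeat=n):
        x = "".join(letters)
        occurrences = sum(x.startswith(w, i) for i in range(n - len(w) + 1))
        if occurrences == s:
            total += word_prob(x, p)
    return total


p = {"0": Fraction(1, 3), "1": Fraction(2, 3)}  # assumed letter probabilities
for s in range(5):
    print(s, count_formula("10", 8, s, p), count_brute("10", 8, s, p))
```

The brute-force side costs $|{\mathcal{A}}|^{n}$ evaluations and is only meant for small $n$; the formula side needs at most $\lfloor n/|w|\rfloor+1$ terms.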

Régnier & Szpankowski (1998) show expectation, variance, and central limit theorems for the occurrences of words. Rukhin & Volkovich (2008) study chi-squared tests with nonoverlapping words. We give moments of all orders for nonoverlapping words. Let $A_{t,s}\colonequals\sum_{r}\dbinom{s}{r}r^{t}(-1)^{s-r}$ for all $t,s=1,2,\ldots$ . Then $A_{t,s}$ is the number of surjective functions from $\{1,2,\ldots,t\}\to\{1,2,\ldots,s\}$ for all $t,s$ , see Riordan (1958), p. 100, Problem 1. Let $\lfloor x\rfloor$ be the greatest integer less than or equal to $x$ .
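The identification of $A_{t,s}$ with the number of surjections can be confirmed numerically for small arguments. The sketch below compares the alternating sum with a direct count; the function names are hypothetical.

```python
from itertools import product
from math import comb


def surjections_formula(t, s):
    """A_{t,s} = sum_r C(s, r) r^t (-1)^(s-r), with r running over 0..s."""
    return sum(comb(s, r) * r ** t * (-1) ** (s - r) for r in range(s + 1))


def surjections_brute(t, s):
    """Count maps {1..t} -> {1..s} whose image is all of {1..s}."""
    return sum(1 for f in product(range(s), repeat=t) if len(set(f)) == s)


for t in range(1, 6):
    print(t, [surjections_formula(t, s) for s in range(1, 6)])
```

For $s>t$ both counts are zero, which matches the upper summation limit $\min\{\lfloor n/|w|\rfloor,t\}$ in Theorem 2.2.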

Theorem 2.2

Let $w$ be nonoverlapping. Under the assumptions of Theorem 2.1 with $h=1$ ,

[TABLE]

for all $t=1,2,\ldots$ .

Proof) Let $Y_{i}=I_{X_{i}^{i+|w|-1}=w}$ . We say that $\{i,i+1,\ldots,i+|w|-1\}$ is the support of $Y_{i}$ . The variables $Y_{n(1)},\ldots,Y_{n(s)}$ are called disjoint if their supports are disjoint, where $1\leq n(j)\leq n-|w|+1,1\leq j\leq s$ for $s\in{\mathbb{N}}$ . Since $w$ is nonoverlapping, we have

[TABLE]

Let $Y_{j,i}=Y_{i}$ for all $1\leq j\leq t$ . Then

[TABLE]

By (3), $E(\prod_{j=1}^{t}Y_{j,n(j)})=P^{s}(w)$ if and only if there is a disjoint set $Y_{l(1)},\ldots,Y_{l(s)}$ such that $\{Y_{1,n(1)},\ldots,Y_{t,n(t)}\}=\{Y_{l(1)},\ldots,Y_{l(s)}\}$ .

The number of possible combinations of $s$ disjoint $\{Y_{l(1)},\ldots,Y_{l(s)}\}$ is $\dbinom{n-s|w|+s}{s}$ . If $n<s|w|$ then there are no $s$ disjoint $Y_{i}$ s. For each disjoint $\{Y_{l(1)},\ldots,Y_{l(s)}\}$ , the number of possible combinations of $n(1),\ldots,n(t)$ such that $\{Y_{1,n(1)},\ldots,Y_{t,n(t)}\}=\{Y_{l(1)},\ldots,Y_{l(s)}\}$ is $A_{t,s}$ . By (4), we have the theorem. ∎
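The moment formula of Theorem 2.2, $E({\mathbf{N}}^{t}(w;X^{n}))=\sum_{s=1}^{\min\{\lfloor n/|w|\rfloor,t\}}A_{t,s}\dbinom{n-s|w|+s}{s}P^{s}(w)$, can be compared with a direct expectation over all strings. This is a minimal sketch with hypothetical helper names and an assumed choice of letter probabilities.

```python
from fractions import Fraction
from itertools import product
from math import comb


def surjections(t, s):
    """A_{t,s}: number of surjections from a t-set onto an s-set."""
    return sum(comb(s, r) * r ** t * (-1) ** (s - r) for r in range(s + 1))


def word_prob(w, p):
    """P(w) as the product of the letter probabilities in the dict p."""
    result = Fraction(1)
    for a in w:
        result *= p[a]
    return result


def moment_formula(w, n, t, p):
    """E(N^t) from the sum over the number s of distinct disjoint occurrences."""
    m, pw = len(w), word_prob(w, p)
    return sum(
        surjections(t, s) * comb(n - s * m + s, s) * pw ** s
        for s in range(1, min(n // m, t) + 1)
    )


def moment_brute(w, n, t, p):
    """E(N^t) by enumerating every string of length n."""
    total = Fraction(0)
    for letters in product(sorted(p), repeat=n):
        x = "".join(letters)
        occurrences = sum(x.startswith(w, i) for i in range(n - len(w) + 1))
        total += occurrences ** t * word_prob(x, p)
    return total


p = {"0": Fraction(1, 3), "1": Fraction(2, 3)}  # assumed letter probabilities
for t in range(1, 5):
    print(t, moment_formula("10", 7, t, p), moment_brute("10", 7, t, p))
```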

3 Explicit formulae for distributions of runs

First we show probability functions for increasing nonoverlapping words.

Let

[TABLE]

For example ${\mathbf{N}}(100,1000;1010001)=(1,1)$ and ${\mathbf{N}}^{\prime}(100,1000;1010001)=(0,1)$ . We write $x\sqsubset y$ if $x$ is a prefix of $y$ and $x\neq y$ . For example $10\sqsubset 100$ . If $w_{1}\sqsubset w_{2}$ and $(k_{1},k_{2})={\mathbf{N}}(w_{1},w_{2};x)$ then $k_{1}\geq k_{2}$ for all $x$ .
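The two counting vectors in the example above can be reproduced with a few lines of code. This is an illustrative sketch; `occurrences` and `increments` are hypothetical names for ${\mathbf{N}}$ and ${\mathbf{N}}^{\prime}$.

```python
def occurrences(words, x):
    """N(w_1, ..., w_h; x): number of positions at which each word occurs in x."""
    return tuple(
        sum(x.startswith(w, i) for i in range(len(x) - len(w) + 1)) for w in words
    )


def increments(words, x):
    """N'(w_1, ..., w_h; x) = (s_1 - s_2, s_2 - s_3, ..., s_h)."""
    s = occurrences(words, x)
    return tuple(a - b for a, b in zip(s, s[1:] + (0,)))


print(occurrences(["100", "1000"], "1010001"))  # (1, 1)
print(increments(["100", "1000"], "1010001"))   # (0, 1)
```

When $w_{1}\sqsubset w_{2}$, every occurrence of $w_{2}$ starts an occurrence of $w_{1}$, so the entries of ${\mathbf{N}}^{\prime}$ are nonnegative.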

Definition 3.1

Let

[TABLE]

where $w_{1}\sqsubset w_{2}\cdots\sqsubset w_{h}$ are increasing nonoverlapping words.

Theorem 3.2

Let ${\mathcal{A}}$ be a finite alphabet and $P$ a probability on ${\mathcal{A}}$ . Let $X_{1},X_{2},\ldots,$ be ${\mathcal{A}}$ -valued i.i.d. random variables from $P(X_{i}=a)=P(a)$ for $a\in{\mathcal{A}}$ . Let $w_{1}\sqsubset w_{2}\cdots\sqsubset w_{h}$ be increasing nonoverlapping words and

[TABLE]

Then

[TABLE]

Proof) We show (7) for $h=2$ . The proof of the general case is similar. Observe that

[TABLE]

Then

[TABLE]

Next, set $z_{1}=X,z_{2}=X(X+1),\ldots,z_{h}=X(X+1)^{h-1}$ in (7). Then

[TABLE]

By setting $Y=X+1$ in (10), we have

[TABLE]

Since

[TABLE]

$P(\sum ik_{i}=t)$ is the coefficient of $Y^{t}$ in $F_{B}$ . On the other hand, by expanding the left-hand-side of (11), we have

[TABLE]

and (8). ∎

Eq. (7) is an inclusion-exclusion principle for increasing nonoverlapping words.

To derive a universal formula for probability functions of runs, we introduce a statistic that represents various types of runs.

Definition 3.3

For $x\in\{0,1\}^{n}$ , let

[TABLE]

where $m_{1}<\ldots<m_{h}$ .

Example 3.4

Consider a run $0^{3}$ and let $x=0\ 0\ 0\ 1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 1\ 0\ 0\ 0\ 0$ .

1. Let $m_{i}=3i$ for $1\leq i\leq 5$ . Then ${{\mathbf{N}^{\prime}}}(10^{3},10^{6},\ldots,10^{15};1x)=(2,1,0,\ldots,0)$ and $D_{16,(3,6,\ldots,15)}(x)=\sum ik_{i}=2+2\cdot 1=4=N_{16,3}(x)$ (0-overlapping enumeration).

2. Let $m_{i}=3+2(i-1)=2i+1$ for $1\leq i\leq 7$ . Then ${{\mathbf{N}^{\prime}}}(10^{3},10^{5},\ldots,10^{15};1x)=(2,0,1,0,\ldots,0)$ and $D_{16,(3,5,\ldots,15)}(x)=\sum ik_{i}=2+3\cdot 1=5$ (1-overlapping enumeration).

3. Let $m_{i}=3+i-1=2+i$ for $i=1,2,\ldots,14$ . Then ${{\mathbf{N}^{\prime}}}(10^{3},10^{4},\ldots,10^{16};1x)=(1,1,0,0,1,0,\ldots,0)$ and $D_{16,(3,4,\ldots,16)}(x)=\sum ik_{i}=1+2\cdot 1+5\cdot 1=8=M_{16,3}(x)$ (2-overlapping enumeration).

4. Let $m_{1}=3$ . Then ${{\mathbf{N}^{\prime}}}(10^{3};1x)=(3)$ and $D_{16,(3)}(x)=3=G_{16,3}(x)$ .
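The four enumerations in this example can be recomputed directly. The sketch below assumes, as the example suggests, that $D_{n,(m_{1},\ldots,m_{h})}(x)=\sum ik_{i}$ with $(k_{1},\ldots,k_{h})={\mathbf{N}}^{\prime}(10^{m_{1}},\ldots,10^{m_{h}};1x)$; this reading is inferred from the worked values rather than quoted from Definition 3.3, and the function names are hypothetical.

```python
def occurrences(words, x):
    """N(w_1, ..., w_h; x): number of positions at which each word occurs in x."""
    return tuple(
        sum(x.startswith(w, i) for i in range(len(x) - len(w) + 1)) for w in words
    )


def increments(words, x):
    """N'(w_1, ..., w_h; x) = (s_1 - s_2, s_2 - s_3, ..., s_h)."""
    s = occurrences(words, x)
    return tuple(a - b for a, b in zip(s, s[1:] + (0,)))


def d_statistic(ms, x):
    """Assumed reading of D: sum of i * k_i for N'(10^{m_1}, ..., 10^{m_h}; 1x)."""
    words = ["1" + "0" * m for m in ms]
    ks = increments(words, "1" + x)  # prepend 1 so a run at the start of x is counted
    return sum(i * k for i, k in enumerate(ks, start=1))


x = "0001000000010000"  # zero runs of lengths 3, 7, 4

print(d_statistic(range(3, 16, 3), x))  # m = 3, 6, ..., 15 (0-overlapping)
print(d_statistic(range(3, 16, 2), x))  # m = 3, 5, ..., 15 (1-overlapping)
print(d_statistic(range(3, 17), x))     # m = 3, 4, ..., 16 (2-overlapping)
print(d_statistic([3], x))              # m = 3 (runs of length at least 3)
```

Under this reading the four calls give 4, 5, 8, and 3, the values of $D$ stated in items 1 to 4.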

When $w_{i}=10^{m_{i}}$ , the difference between $D_{n}$ and $C_{n}$ is that $D_{n}$ counts $0^{m}$ for $m\geq m_{1}$ from the beginning of $x$ while $C_{n}$ does not.

Lemma 3.5

Let $X_{1},X_{2},\ldots,$ be i.i.d. binary random variables from $P(X_{i}=1)=q$ and $P(X_{i}=0)=p$ for all $i$ . Let $m_{1}<\ldots<m_{h}$ and $w_{i}=10^{m_{i}}$ for $1\leq i\leq h$ . Then for all $t\geq 0$ ,

[TABLE]

Proof) Observe that

[TABLE]

We have

[TABLE]

∎

Theorem 3.6 (main theorem)

Let $X_{1},X_{2},\ldots,$ be i.i.d. binary random variables from $P(X_{i}=1)=q$ and $P(X_{i}=0)=p$ for all $i$ . Let $m_{1}<\ldots<m_{h}$ and $w_{i}=10^{m_{i}}$ for $1\leq i\leq h$ . Then for all $t\geq 0$ ,

[TABLE]

Proof) Part 1 follows from Theorem 3.2 and Lemma 3.5. Part 2 follows from part 1. Part 3 follows from $P(G_{n,m}=t)=P(D_{n,(m)}=t)$ . Part 4 follows from $P(T_{m}>n)=P(L_{n}<m)=P(G_{n,m}=0)$ .

Proof of part 5. Let $h=2$ , $w_{1}=10^{m}$ , and $w_{2}=10^{m+1}$ in Theorem 3.2. By (7), we have

[TABLE]

Set $z_{1}=x-1$ and $z_{2}=1-x$ . We have

[TABLE]

where $\bar{E}_{n,m}(x)$ is the number of $10^{m}$ of size exactly $m+1$ in $x$ .

On the other hand,

[TABLE]

Since $P^{k_{1}}(w_{1})P^{k_{2}}(w_{2})=q^{k_{1}+k_{2}}p^{k_{1}m+k_{2}(m+1)}$ , from (14) and (15), we have

[TABLE]

In a manner similar to Lemma 3.5, we have part 5. ∎

Remark 3.7

It is straightforward to extend Theorem 3.6 from binary valued to countably valued i.i.d. random variables. Let $q_{j}$ , $j=0,1,\ldots$ be a sequence of non-negative reals such that $\sum_{j}q_{j}=1$ and $Y_{1},Y_{2},\ldots Y_{n}\in\{0,1,2,\ldots\}$ be i.i.d. trials from $Q(Y_{i}=j)=q_{j}$ for all $i,j$ . Let $X_{1},\ldots,X_{n}$ be binary i.i.d. trials from $P(X_{i}=1)=1-q_{0}$ and $P(X_{i}=0)=q_{0}$ for all $i$ . Then $Q(D_{n,(m_{1},\ldots,m_{h})}(Y_{1}^{n})=t)=P(D_{n,(m_{1},\ldots,m_{h})}(X_{1}^{n})=t)$ for all $t$ .

4 Distance of distributions

We show that $P(C_{n,(w_{1},\ldots,w_{d})}=t)$ and $P(D_{n,(m_{1},\ldots,m_{d})}=t)$ uniformly converge to $P(C_{n,(w_{1},\ldots,w_{h})}=t)$ and $P(D_{n,(m_{1},\ldots,m_{h})}=t)$ as $d\to h$ , respectively.

Proposition 4.1

Let $X_{1},\ldots,X_{n}$ be i.i.d. binary random variables from $P(X_{i}=0)=p$ . Assume $d<h$ . Then

[TABLE]

Proof) Assume that ${\mathbf{N}}^{\prime}(w_{1},\ldots,w_{h};x_{1}^{n})=(k_{1},k_{2},\ldots,k_{h})$ . By (5), $C_{n,(w_{1},\ldots,w_{d})}(x_{1}^{n})=C_{n,(w_{1},\ldots,w_{h})}(x_{1}^{n})$ if $k_{d+1}=\cdots=k_{h}=0$ . Then for all $t$ ,

[TABLE]

Let $w_{i}=10^{m_{i}}$ for all $i$ . By Theorem 3.6, for all $t$ ,

[TABLE]

where the last inequality follows from (16) and $P(w_{d+1})=qp^{m_{d+1}}$ . ∎

Assume that $X_{1},\ldots,X_{n}$ are i.i.d. binary random variables from $P(X_{i}=0)=0.5$ . Let

[TABLE]

Table 1 shows numerical calculations of $\operatorname{dist}(d,995|40)$ for $n=1000,d=1,3,5,7,9$ , and $m_{i}=5+i$ for $i=1,2,\ldots,995$ . Figure 1 shows graphs of $P(D_{n,(m_{1},\ldots,m_{d})}(X_{1}^{n})=t)$ for $d=1,2,3$ , and $995$ .

5 Algorithm and computational complexity

We study an algorithm for computing (8) and its computational complexity. The basic idea of our algorithm is similar to that of bucket sort (Cormen et al. (2009)). When $P(C_{n,(m_{1},\ldots,m_{h})}>t)$ is negligible for some $t$ , it suffices to compute $P(C_{n}=s)$ for $s=0,\ldots,t$ . The following Algorithm A computes $P(C_{n}=s)$ for all $s=0,\ldots,t$ .

Let ${\mathbb{Z}}_{\geq 0}=\{0,1,2,\ldots\}$ and

[TABLE]

**Algorithm A**

1. Initialize $P(C_{n}=s)=0$ for all $s=0,\ldots,t$ .

2. Enumerate all nonnegative vectors $(r,k_{1},\ldots,k_{d})\in G(n,t,|w_{1}|,\ldots,|w_{d}|)$ .

For each vector $(r,k_{1},\ldots,k_{d})\in G(n,t,|w_{1}|,\ldots,|w_{d}|)$ , set

[TABLE]

where $s=\sum ik_{i}-r$ .

3. Output $P(C_{n}=s)$ for all $s=0,\ldots,t$ .

Since Algorithm A enumerates all combinations of $r,k_{1},\ldots,k_{d}$ for given $h=0,\ldots,t$ in (8), Algorithm A correctly computes $P(C_{n,(w_{1},\ldots,w_{d})}(X_{1}^{n})=h)$ for all $h=0,\ldots,t$ .

The bottleneck of the computational complexity of Algorithm A is the size of $G$ . In Theorem 5.2, we give an upper bound of the size of $G$ . The algorithm and computational complexity for computing $D_{n}$ are similar to those for $C_{n}$ .
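The bucket-style accumulation can be sketched as follows. This is not the paper's Algorithm A verbatim: it assumes the summand of (8) is $(-1)^{r}\binom{n-\sum|w_{i}|k_{i}+\sum k_{i}}{k_{1},\ldots,k_{h}}\binom{\sum k_{i}}{r}\prod P^{k_{i}}(w_{i})$ added to the bucket $\sum ik_{i}-r$, it enumerates every admissible $(r,k_{1},\ldots,k_{h})$ instead of stopping at a threshold $t$, and all function names are hypothetical. A brute-force enumeration over strings is included for comparison on a small case.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def multinomial(n, parts):
    """The paper's coefficient n! / (a_1! ... a_l! (n - sum a_i)!)."""
    denom = factorial(n - sum(parts))
    for a in parts:
        denom *= factorial(a)
    return factorial(n) // denom


def word_prob(w, p):
    """P(w) as the product of the letter probabilities in the dict p."""
    result = Fraction(1)
    for a in w:
        result *= p[a]
    return result


def c_distribution(words, n, p):
    """Return {s: P(C_n = s)} by adding each term of (8) to its bucket s."""
    lengths = [len(w) for w in words]
    probs = [word_prob(w, p) for w in words]
    dist = {}

    def visit(i, ks, used):
        if i == len(words):
            total_k = sum(ks)
            weight = Fraction(multinomial(n - used + total_k, ks))
            for k, pw in zip(ks, probs):
                weight *= pw ** k
            base = sum(j * k for j, k in enumerate(ks, start=1))
            for r in range(total_k + 1):
                s = base - r  # bucket index
                dist[s] = dist.get(s, 0) + (-1) ** r * comb(total_k, r) * weight
            return
        # k_i ranges over values that keep the total word length within n.
        for k in range((n - used) // lengths[i] + 1):
            visit(i + 1, ks + [k], used + lengths[i] * k)

    visit(0, [], 0)
    return dist


def c_brute(words, n, p):
    """Return {s: P(C_n = s)} by enumerating every string of length n."""
    dist = {}
    for letters in product(sorted(p), repeat=n):
        x = "".join(letters)
        s = [sum(x.startswith(w, i) for i in range(n - len(w) + 1)) for w in words]
        ks = [a - b for a, b in zip(s, s[1:] + [0])]
        c = sum(j * k for j, k in enumerate(ks, start=1))
        dist[c] = dist.get(c, 0) + word_prob(x, p)
    return dist


p = {"0": Fraction(1, 3), "1": Fraction(2, 3)}  # assumed letter probabilities
formula = c_distribution(["10", "100"], 7, p)
brute = c_brute(["10", "100"], 7, p)
for s in sorted(formula):
    print(s, formula[s], brute.get(s, 0))
```

Restricting the enumeration to vectors with $\sum ik_{i}-r\leq t$, as Algorithm A does, only discards buckets above $t$ and leaves the retained values unchanged.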

First we prove a lemma. Let $|S|$ be the number of the elements of a finite set $S$ and

[TABLE]

Lemma 5.1

For all $n,|w_{1}|,\ldots,|w_{d}|$ ,

[TABLE]

Proof) We prove the lemma by induction on $d$ . If $d=1$ , $F(n,|w_{1}|)=\lfloor\frac{n}{|w_{1}|}\rfloor+1\leq\frac{n}{|w_{1}|}+1$ and (18) is true. Assume that (18) is true for some $d$ . Let $f(x)=\frac{(n+\sum_{1\leq i\leq d}|w_{i}|-(x-1)|w_{d+1}|)^{d}}{d!\prod_{i}|w_{i}|}$ for $0\leq x\leq\frac{n+\sum|w_{i}|}{|w_{d+1}|}+1$ . Since

[TABLE]

we have $F(n,|w_{1}|,\ldots,|w_{d+1}|)\leq f(1)+f(2)+\cdots+f(h)$ , where $h=\lfloor\frac{n}{|w_{d+1}|}\rfloor+1$ . Let $\lceil x\rceil$ be the least integer that is greater than or equal to $x$ . Since $f$ is convex, we have

[TABLE]

and (18) is true for $d+1$ . By induction, we have the lemma. ∎

Theorem 5.2

Let $n$ be the sample size. For given $n,d,t,$ and $w_{1},\ldots,w_{d}$ ,

1.

[TABLE]

2. Fix $d$ and $t$ . Then

[TABLE]

3. Fix $d$ . Then

[TABLE]

In particular, if $t=O(n^{\frac{\alpha}{d}})$ and $\alpha>0$ , then

[TABLE]

4. Let $\alpha,\beta,\gamma$ be positive constants, $t\sim\alpha d$ , and $d\sim\frac{\log n+\beta}{\gamma}$ . Then

[TABLE]

In particular, if $\gamma=-\log p$ and $\beta=(r+1)\log 2+\log p$ , then for all $d<h$ and $r>0$ ,

[TABLE]

Proof) Let $(k_{1},\ldots,k_{d})\in{\mathbb{Z}}^{d}_{\geq 0}$ . Then

[TABLE]

By Lemma 5.1, the number of $(k_{2},\ldots,k_{d})$ that satisfies (19) is less than or equal to $\frac{(n+d(d-1)/2)^{d-1}}{((d-1)!)^{2}}$ . By (17), we have $|w_{1}|k_{1}\leq\sum|w_{i}|k_{i}\leq n$ , and the number of $(k_{1},\ldots,k_{d})$ such that $(r,k_{1},\ldots,k_{d})\in G(n,t,|w_{1}|,\ldots,|w_{d}|)\text{ for some }r\geq 0$ is less than or equal to $(1+\frac{n}{|w_{1}|})\frac{(n+d(d-1)/2)^{d-1}}{((d-1)!)^{2}}$ . By (17), the number of possible $r$ such that $(r,k_{1},\ldots,k_{d})\in G$ for each fixed $k_{1},\ldots,k_{d}$ is less than or equal to $\sum k_{i}-\sum ik_{i}+t+1\leq t+1$ , and we have part 1.

Parts 2 and 3 follow from part 1.

Proof of part 4. Let $t=\alpha d(1+o(1))$ . Then

[TABLE]

where (20) follows from $\log(1+x)\leq x$ , and (21) follows from Stirling's formula $d!\sim(2\pi)^{\frac{1}{2}}e^{-d}d^{d+\frac{1}{2}}$ . Let $d\sim\frac{\log n+\beta}{\gamma}$ . By (21), we have $\frac{1}{(d!)^{2}}(t+\frac{1}{2}d(d+1))^{d}=O(n^{(2-\log 2)/\gamma}/d)$ . By part 1 we have part 4. ∎

Remark 5.3

In part 4 of Theorem 5.2, if $\gamma=-\log 0.5$ then $2.88<(2-\log 2)/\gamma+1<2.89$ . On the other hand, to compute exact distributions by the Markov imbedding method, we need to calculate $M^{n}$ for sample size $n$ and an $m\times m$ matrix $M$ with $m=O(n)$ . The number of arithmetical operations to compute $M^{2}$ is $O(n^{2.81})$ and that to compute $M^{n}$ is $O(n^{2.81}\log n)$ with the Strassen algorithm (Cormen et al. (2009)).
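The two exponents compared in this remark are quick to evaluate. The sketch below assumes that $\log$ denotes the natural logarithm, which is the reading under which the stated bounds $2.88$ and $2.89$ hold, and uses the standard fact that the Strassen algorithm multiplies matrices with exponent $\log_{2}7$.

```python
from math import log, log2

gamma = -log(0.5)                    # gamma = -log p with p = 0.5 (natural log assumed)
exponent = (2 - log(2)) / gamma + 1  # exponent of n from part 4 of Theorem 5.2
strassen = log2(7)                   # exponent of Strassen matrix multiplication

print(f"(2 - log 2)/gamma + 1 = {exponent:.4f}")
print(f"log2(7)               = {strassen:.4f}")
```

With $\gamma=\log 2$ the first expression simplifies to $2/\log 2$, so the comparison is between roughly $2.885$ and roughly $2.807$.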

Acknowledgement

This work was supported by the Research Institute for Mathematical Sciences, an International Joint Usage/Research Center located in Kyoto University. The author thanks Prof. Shigeki Akiyama (Tsukuba Univ.) for discussions.

Bibliography

Aki, S. & Hirano, K. (2000), ‘Numbers of success-runs of specified length until certain stopping time rules and generalized binomial distributions of order k’, Ann. Inst. Statist. Math. 52(4), 767–777.
Aki, S., Kuboki, H. & Hirano, K. (1984), ‘On discrete distributions of order k’, Ann. Inst. Statist. Math. 36, 431–440.
Antzoulakos, D. L. & Chadjiconstantindis, S. (2001), ‘Distributions of numbers of success runs of fixed length in Markov dependent trials’, Ann. Inst. Statist. Math. 53(3), 599–619.
Balakrishnan, N. & Koutras, M. V. (2002), Runs and scans with applications, John Wiley & Sons.
Bassino, F., Clément, J. & Micodème, P. (2010), ‘Counting occurrences for a finite set of words: combinatorial methods’, ACM Trans. Algorithms 9(4), Article No. 31.
Berthé, V. & Rigo, M. (2016), Combinatorics, words and symbolic dynamics, Encyclopedia of Mathematics and Its Applications 159, Cambridge University Press.
Blom, G. & Thorburn, D. (1982), ‘How many random digits are required until given sequences are obtained?’, J. Appl. Probab. 19(3), 518–531.
