Non-asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm

Larry Goldstein

arXiv:1703.00505·math.PR·October 22, 2018

TL;DR

This paper establishes non-asymptotic bounds on the distributional approximation of the Quickselect algorithm's running time using the Dickman distribution, providing explicit convergence rates and insights into the approximation's accuracy.

Contribution

It introduces a new Wasserstein distance bound for the Dickman approximation and applies it to derive explicit, non-asymptotic convergence rates for Quickselect's running time distribution.

Findings

01

Derived explicit bounds for the distributional distance between Quickselect's running time and the Dickman distribution.

02

Showed, via matching lower bounds, that the $\log n/n$ rate of convergence cannot be improved for $m \neq 2$.

03

Used Knuth's exact expression for the expected running time $E[C_{n,m}]$ of Quickselect to derive lower bounds on the Wasserstein distance.

Abstract

Given a non-negative random variable $W$ and $θ > 0$ , let the generalized Dickman transformation map the distribution of $W$ to that of $W^{*} =_{d} U^{1/ θ} (W + 1),$ where $U \sim U [0, 1]$ , a uniformly distributed variable on the unit interval, independent of $W$ , and where $=_{d}$ denotes equality in distribution. It is well known that $W^{*}$ and $W$ are equal in distribution if and only if $W$ has the generalized Dickman distribution $D_{θ}$ . We demonstrate that the Wasserstein distance $d_{1}$ between $W$ , a non-negative random variable with finite mean, and $D_{θ}$ having distribution $D_{θ}$ obeys the inequality $d_{1} (W, D_{θ}) \leq (1 + θ) d_{1} (W, W^{*}) .$ The specialization of this bound to the case $θ = 1$ and coupling constructions yield $$ d_1(W_{n,1},D) \le \frac{8\log (n/2)+10}{n} \quad \mbox{for all $n \ge 1$, where} \quad…

Equations

$$ W^{*} =_{d} U^{1/\theta}(W+1), $$

$$ d_{1}(W,D_{\theta})\leq(1+\theta)\,d_{1}(W,W^{*}). $$

$$ d_{1}(W_{n,1},D_{1})\leq\frac{8\log(n/2)+10}{n}\quad\mbox{for all $n\geq 1$, where for $m\geq 1$}\quad W_{n,m}=\frac{1}{n}C_{n,m}-1, $$

$$ W\sim{\cal D}_{\theta}\quad\mbox{if and only if}\quad W=_{d}W^{*}. $$

$$ d_{1}(W_{n,1},D)\leq\frac{8\log(n/2)+10}{n}. $$

$$ d_{1}(W_{n,m},D)\leq\frac{(46m+8)\log(n/m)+54m+8}{n}. $$

$$ d_{1}(W_{n,m},D)\geq\frac{2(|m-2|\log n-|(m+2)h_{m}-3|)}{n}\quad\mbox{for all $n\geq m$.} $$

$$ E[C_{n,m}]=2\left[n+3+(n+1)h_{n}-(m+2)h_{m}-(n-m+3)h_{n-m+1}\right]. $$

$$ E[C_{n,1}],\quad E[C_{n,2}],\quad E[C_{n,3}],\quad E[C_{n,4}] $$

$$ d_{1}(X,Y)=\sup_{h\in{\rm Lip}_{1}}|Eh(X)-Eh(Y)|\quad\mbox{where}\quad{\rm Lip}_{1}=\{h:|h(y)-h(x)|\leq|y-x|\}. $$

$$ d_{1}(W,D_{\theta})\leq(1+\theta)\,d_{1}(W^{*},W). $$

$$ d_{1}(X,Y)=\inf E|X-Y| $$

$$ d_{1}(W,D_{\theta})\leq(1+\theta)\,E|W^{*}-W| $$

$$ C_{n}=n-1+C_{V_{1}}\quad\mbox{for $n\geq 1$, with boundary condition $C_{0}=0$,} $$

$$ \textbf{U}_{k}=(U_{k},U_{k+1},\ldots)\quad\mbox{for $k\geq 1$,} $$

$$ V_{k}=\lfloor V_{k-1}U_{k}\rfloor\quad\mbox{for $k\geq 1$} $$

$$ C(n;\textbf{U}_{1})=n-1+C(\lfloor nU_{1}\rfloor;\textbf{U}_{2})\quad\mbox{for $n\geq 1$, with}\quad C(0;\textbf{U}_{k})=0\quad\mbox{for all $k\geq 1$.} $$

$$ e_{n}\leq c+\frac{1}{n}\sum_{u=q}^{n-1}e_{u}\quad\mbox{for all $n\geq q$,} $$

$$ e_{n}\leq c\log(en/q)\quad\mbox{for $n\geq q$.} $$

$$ e_{n}\leq c+\frac{1}{n}\sum_{u=q}^{n-1}e_{u}\leq c+\frac{c}{n}\sum_{u=q}^{n-1}\log(eu/q)\leq c+\frac{c}{n}\int_{q}^{n}\log(eu/q)\,du=c\left(1+\frac{1}{n}\Big[u\log(eu/q)-u\Big]_{q}^{n}\right)=c\left(1+\frac{1}{n}\left[n\log(en/q)-n\right]\right)=c\log(en/q), $$

$$ W_{n}=\frac{1}{n}C(n;\textbf{U}_{1})-1=\frac{1}{n}\left(n-1+C(V_{1};\textbf{U}_{2})\right)-1=\frac{1}{n}\left(C(V_{1};\textbf{U}_{2})-1\right). $$

$$ W_{n}^{\prime}:=\frac{1}{n}C(n;\textbf{U}_{2})-1=_{d}\frac{1}{n}C(n;\textbf{U}_{1})-1=W_{n}, $$

$$ W_{n}^{*}:=U_{1}(W_{n}^{\prime}+1)=\frac{1}{n}U_{1}C(n;\textbf{U}_{2}) $$

$$ W_{n}^{*}-W_{n}=\frac{1}{n}\left(U_{1}C(n;\textbf{U}_{2})-C(V_{1};\textbf{U}_{2})+1\right) $$

$$ nE|W_{n}^{*}-W_{n}|\leq e_{n}+1,\quad\mbox{where we set}\quad e_{k}=E\left|U_{1}C(k;\textbf{U}_{2})-C(\lfloor kU_{1}\rfloor;\textbf{U}_{2})\right|,\quad k\geq 0, $$

$$ d_{1}(W_{n},D)\leq 2E|W_{n}^{*}-W_{n}|\leq\frac{2}{n}(e_{n}+1). $$

$$ e_{n}=E\left|U_{1}C(n;\textbf{U}_{2})-C(\lfloor nU_{1}\rfloor;\textbf{U}_{2})\right|\leq E\left|U_{1}(n-1)-\lfloor nU_{1}\rfloor+1\right|+E\left|U_{1}C(\lfloor nU_{2}\rfloor;\textbf{U}_{3})-C(\lfloor\lfloor nU_{1}\rfloor U_{2}\rfloor;\textbf{U}_{3})\right|. $$

Full text

This work was partially supported by NSA grant H98230-15-1-0250.

MSC 2010 subject classifications: Primary 60F05, 68Q25.

Key words and phrases: sorting, complexity, distributional approximation.

Non-asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm

Larry Goldstein

(Department of Mathematics, University of Southern California)

Abstract

Given a non-negative random variable $W$ and $\theta>0$ , let the generalized Dickman transformation map the distribution of $W$ to that of

$$ W^{*} =_{d} U^{1/\theta}(W+1), $$

where $U\sim{\cal U}[0,1]$ , a uniformly distributed variable on the unit interval, independent of $W$ , and where $=_{d}$ denotes equality in distribution. It is well known that $W^{*}$ and $W$ are equal in distribution if and only if $W$ has the generalized Dickman distribution ${\cal D}_{\theta}$ . We demonstrate that the Wasserstein distance $d_{1}$ between $W$ , a non-negative random variable with finite mean, and $D_{\theta}$ having distribution ${\cal D}_{\theta}$ obeys the inequality

$$ d_{1}(W,D_{\theta})\leq(1+\theta)\,d_{1}(W,W^{*}). $$

The specialization of this bound to the case $\theta=1$ and coupling constructions yield

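$$ d_{1}(W_{n,1},D_{1})\leq\frac{8\log(n/2)+10}{n}\quad\mbox{for all $n\geq 1$, where for $m\geq 1$}\quad W_{n,m}=\frac{1}{n}C_{n,m}-1, $$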

and $C_{n,m}$ is the number of comparisons made by the Quickselect algorithm to find the $m^{th}$ smallest element of a list of $n$ distinct numbers. A similar bound holds for $W_{n,m}$ for $m\geq 2$, and together these bounds recover and quantify the results of [12] that show distributional convergence of $W_{n,m}$ to the standard Dickman distribution ${\cal D}_{1}$ in the asymptotic regime $m=o(n)$. By comparison to an exact expression for the expected running time $E[C_{n,m}]$, lower bounds are provided that show the rate is not improvable for $m\neq 2$.

1 Introduction

For a given non-negative random variable $W$ and $\theta>0$ , let the generalized Dickman transformation map the distribution of $W$ to that of

$$ W^{*} =_{d} U^{1/\theta}(W+1), \qquad (1) $$

where $U$ has the uniform distribution ${\cal U}[0,1]$ on the unit interval, and is independent of $W$ and where $=_{d}$ denotes equality in distribution. It is well known [6], [15] that the generalized Dickman distribution ${\cal D}_{\theta}$ is the unique fixed point of the transformation (1), that is,

$$ W\sim{\cal D}_{\theta}\quad\mbox{if and only if}\quad W=_{d}W^{*}. \qquad (2) $$

When (1) holds we will say that $W^{*}$ has the ${\cal D}_{\theta}$ -bias distribution of $W$ . In what follows, $D_{\theta}$ will denote a random variable with distribution ${\cal D}_{\theta}$ . The case $\theta=1$ corresponds to the (standard) Dickman distribution, for which we may drop the subscript $\theta$ .
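As a rough illustration of the fixed point property (2), iterating the transformation (1) from $W=0$ with fresh independent uniforms gives, in distribution, the partial sums of $\sum_{k\geq 1}\prod_{j\leq k}U_{j}^{1/\theta}$. The sketch below truncates that sum to draw approximate samples from ${\cal D}_{\theta}$. The function name `sample_dickman` and the truncation length `terms` are illustrative assumptions, not part of the paper. Taking expectations in (1) under (2) gives $E[D_{\theta}]=\theta$, which the sample mean should roughly reproduce.

```python
import random


def sample_dickman(theta=1.0, terms=60, rng=random):
    """Approximate draw from D_theta (hypothetical helper, not from the paper).

    Sums the first `terms` partial products U_1^{1/theta} * ... * U_k^{1/theta},
    i.e. the result of iterating W -> U^{1/theta} (W + 1) starting from W = 0.
    """
    total, product = 0.0, 1.0
    for _ in range(terms):
        product *= rng.random() ** (1.0 / theta)
        total += product
    return total


# Usage: the sample mean should be close to theta (here theta = 1).
rng = random.Random(1)
draws = [sample_dickman(1.0, rng=rng) for _ in range(20000)]
mean_estimate = sum(draws) / len(draws)
```

The truncation error has mean $\theta\,(\theta/(\theta+1))^{\mathtt{terms}}$, which is negligible for moderate $\theta$ at 60 terms.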

The Dickman function $\rho$ first made its appearance in number theory [7] when counting the number of integers below a fixed threshold whose prime factors satisfy some given upper bound. Standardizing $\rho$ yields the density of the standard Dickman distribution, the canonical member of the family ${\cal D}_{\theta},\theta>0$, of generalized Dickman distributions, which also arise in the study of component counts of logarithmic combinatorial structures such as permutations and partitions [1], and more generally for the quasi-logarithmic class considered in [3]. See also the recent work [17], [2] and [4] in this area, that detail some connections to probabilistic number theory.

Members from the generalized Dickman family have subsequently been noted to arise in a variety of other contexts, in particular for the sum of edge lengths of vertices connected to the origin in minimal directed spanning trees in [15], and for weighted sums of independent random variables in [16], [2] and [4]. Simulation of the Dickman distribution has been considered in [6].

Here we study the error incurred when using the standard Dickman distribution to approximate that of the (properly normalized) number of comparisons made by the Quickselect sorting algorithm of Hoare [11] for locating the $m^{th}$ smallest element of a list of $n$ distinct numbers. One may visualize how Quickselect works in terms of a tree structure. First, a ‘pivot’ is chosen uniformly from the given list. The list is then divided into those numbers on the list that are strictly smaller, making up the left subtree, and those that are strictly larger, making up the right. If the left subtree is of size $m-1$ then the pivot is the desired $m^{th}$ smallest element, and the procedure terminates. Otherwise, the process continues recursively on the left subtree if it is of size $m$ or larger, and else on the right subtree.
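The procedure just described can be sketched as follows. This is a minimal illustration and not code from the paper: the function name `quickselect` and its return convention are assumptions, and each stage is charged `len(items) - 1` comparisons to match the cost model used here, even though the list comprehensions perform more elementary comparisons than that.

```python
import random


def quickselect(values, m, rng=random):
    """Return (m-th smallest element, comparison count) for distinct values.

    Assumes 1 <= m <= len(values). Each stage is charged len(items) - 1
    comparisons: the pivot against every other element of the current list.
    """
    items = list(values)
    comparisons = 0
    while True:
        pivot = rng.choice(items)          # pivot chosen uniformly
        comparisons += len(items) - 1
        left = [x for x in items if x < pivot]
        right = [x for x in items if x > pivot]
        if len(left) == m - 1:             # pivot is the m-th smallest
            return pivot, comparisons
        if len(left) >= m:                 # continue in the left subtree
            items = left
        else:                              # continue in the right subtree
            m -= len(left) + 1
            items = right


# Usage: find the 3rd smallest of a small list.
value, cost = quickselect([41, 7, 19, 3, 28], 3)
```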

Letting

$$ W_{n,m}=\frac{1}{n}C_{n,m}-1, \qquad (3) $$

where $C_{n,m}$ is the number of comparisons made by Quickselect, the work of [12] showed that $W_{n,m}$ converges in distribution to the Dickman $D$ when $m=o(n)$ . We note that in the case $m=1$ Quickselect simplifies in that at each step of the recursion the procedure either stops or continues on the left subtree. As this case is simpler than for $m\geq 2$ we deal with it separately.

The following two theorems quantify and recover the results of [12] by providing non-asymptotic bounds in the Wasserstein distance $d_{1}$ between $W_{n,m}$ and $D$ that converge to zero in the $m=o(n)$ asymptotic regime. As the $m^{th}$ smallest number of a list of $n$ distinct numbers only exists when $n\geq m$, we need only consider this range of parameters in what follows.

Theorem 1.1

Let $C_{n,1}$ be the number of comparisons made by Quickselect to find the smallest of a list of $n$ distinct numbers, and let $W_{n,1}$ be given by (3). Then for all $n\geq 1$

$$ d_{1}(W_{n,1},D)\leq\frac{8\log(n/2)+10}{n}. $$

Theorem 1.2

Let $m\geq 2$, let $C_{n,m}$ be the number of comparisons made by Quickselect to find the $m^{th}$ smallest element of a list of $n$ distinct numbers, and let $W_{n,m}$ be given by (3). Then for all $n\geq m$

$$ d_{1}(W_{n,m},D)\leq\frac{(46m+8)\log(n/m)+54m+8}{n}. $$
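To get a feel for the size of these bounds, the sketch below evaluates the right-hand sides of Theorems 1.1 and 1.2, namely $(8\log(n/2)+10)/n$ for $m=1$ and $((46m+8)\log(n/m)+54m+8)/n$ for $m\geq 2$. The function name `upper_bound` is an illustrative assumption.

```python
import math


def upper_bound(n, m=1):
    """Right-hand side of Theorem 1.1 (m = 1) or Theorem 1.2 (m >= 2), n >= m."""
    if m == 1:
        return (8 * math.log(n / 2) + 10) / n
    return ((46 * m + 8) * math.log(n / m) + 54 * m + 8) / n


# Usage: the bound shrinks as n grows with m fixed, and also when m = o(n).
bound_small_list = upper_bound(100, 1)
bound_large_list = upper_bound(10**6, 1)
bound_growing_m = upper_bound(10**6, 1000)
```

With the constants above, the bound is informative only once $n$ is large relative to $m$, which is in line with the $m=o(n)$ regime.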

That the bounds in Theorems 1.1 and 1.2 are tight in the $\log n/n$ order for $m\not=2$ is a consequence of the following result; in the following, we let $h_{n}=\sum_{1\leq k\leq n}1/k$ for $n\geq 1$ .

Theorem 1.3

For all $m\geq 1$ ,

$$ d_{1}(W_{n,m},D)\geq\frac{2(|m-2|\log n-|(m+2)h_{m}-3|)}{n}\quad\mbox{for all $n\geq m$.} $$

We note that in the case $m=1$ the lower bound simplifies to $2\log n/n$ . That our method, where we focus only on the expectation $E[C_{n,m}]$ to achieve our lower bound, does not succeed in the case $m=2$ is explained by the lack of the term $h_{n}$ on the right hand side of (6). Theorem 1.3 is shown using the following exact expression for the expected running time of Quickselect; see also Section 6 of [9].

Theorem 1.4 (Knuth [13])

Let $C_{n,m}$ be the number of comparisons made by Quickselect to locate the $m^{th}$ smallest of $n$ distinct numbers. Then for all $n\geq m\geq 1$

$$ E[C_{n,m}]=2\left[n+3+(n+1)h_{n}-(m+2)h_{m}-(n-m+3)h_{n-m+1}\right]. \qquad (4) $$

In particular,

[TABLE]
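As a sanity check on the exact expression $E[C_{n,m}]=2[n+3+(n+1)h_{n}-(m+2)h_{m}-(n-m+3)h_{n-m+1}]$ of Theorem 1.4, the sketch below compares it, in exact rational arithmetic, with the expectation obtained by conditioning on the size of the left subtree after the first pivot. The function names are illustrative assumptions. Direct substitution suggests that the formula reduces to $2(n-h_{n})$ for $m=1$ and to $2(n-2+1/n)$ for $m=2$, the latter containing no $h_{n}$ term; the code can be used to confirm both for small $n$.

```python
from fractions import Fraction
from functools import lru_cache


def harmonic(n):
    """h_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def expected_formula(n, m):
    """Closed form of Theorem 1.4, valid for n >= m >= 1."""
    return 2 * (n + 3 + (n + 1) * harmonic(n)
                - (m + 2) * harmonic(m)
                - (n - m + 3) * harmonic(n - m + 1))


@lru_cache(maxsize=None)
def expected_recursive(n, m):
    """E[C_{n,m}] by conditioning on the left subtree size v in {0,...,n-1}."""
    if n < m or n <= 1:
        return Fraction(0)
    total = Fraction(0)
    for v in range(n):
        if v >= m:                      # target lies in the left subtree
            total += expected_recursive(v, m)
        elif v < m - 1:                 # target lies in the right subtree
            total += expected_recursive(n - 1 - v, m - v - 1)
        # v == m - 1: the pivot is the target, no further cost
    return (n - 1) + total / n


# Usage: both computations agree, e.g. E[C_{3,2}] = 8/3.
assert expected_formula(3, 2) == expected_recursive(3, 2) == Fraction(8, 3)
```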

Theorems 1.1 and 1.2 are derived by applying Theorem 1.5, which quantifies the "if" direction of the fixed point property (2) in the Wasserstein, or $d_{1}$, metric between two random variables $X$ and $Y$, given by

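$$ d_{1}(X,Y)=\sup_{h\in{\rm Lip}_{1}}|Eh(X)-Eh(Y)|\quad\mbox{where}\quad{\rm Lip}_{1}=\{h:|h(y)-h(x)|\leq|y-x|\}. \qquad (7) $$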

On the left hand side of (7) we have chosen to write $d_{1}(X,Y)$ , rather than the technically correct expression $d_{1}({\cal L}(X),{\cal L}(Y))$ , only for notational convenience.

Theorem 1.5

Let $W$ be a non-negative random variable with finite mean, let $\theta>0$ , and let the law of $W^{*}$ be given by (1). Then

$$ d_{1}(W,D_{\theta})\leq(1+\theta)\,d_{1}(W^{*},W). $$

As the Wasserstein distance also satisfies

$$ d_{1}(X,Y)=\inf E|X-Y| \qquad (8) $$

where the infimum is over all couplings $(X,Y)$ having the given marginals, and is achieved here (see [18], for instance), Theorem 1.5 implies that

$$ d_{1}(W,D_{\theta})\leq(1+\theta)\,E|W^{*}-W| \qquad (9) $$

for any non-negative random variable $W$ with finite mean, and $W^{*}$ defined on a common space having the ${\cal D}_{\theta}$ -bias distribution of $W$ .

In Section 2 we detail the workings of the Quickselect algorithm and prove Theorems 1.1 and 1.2 by applying Theorem 1.5, which is proved in Section 3. The proof of Theorem 1.3 appears in Section 4.

In related work, [8] considers the Quicksort method, which produces a fully sorted list, and [5] obtains distributional bounds for the running time of a variation of Quickselect to a non-Dickman approximand; compare its characterization in (1.4) there to (1) here.

2 The Quickselect Method and the Proofs of Theorems 1.1 and 1.2

In this section we apply Theorem 1.5 to obtain the bounds in Theorems 1.1 and 1.2 on the error of the Dickman approximation for the distribution of $W_{n,m}$ in (3), the properly normalized running time of the Quickselect algorithm for finding the $m^{th}$ smallest element of a list of $n$ distinct numbers. When the value of $m$ is clear from context, we will write $C_{n}$ for $C_{n,m}$ .

2.1 Quickselect: the case $m=1$

In this section we prove Theorem 1.1 for the distribution of the number $C_{n}$ of comparisons that Quickselect requires to locate the smallest element of a list of $n$ distinct numbers. Clearly, a list of size zero requires no comparisons, hence $C_{0}=0$ . For $n\geq 1$ , the procedure requires the $n-1$ comparisons of the pivot to every other element at the first stage, followed by the cost of processing the left subtree, which may be empty. Since the pivot is chosen uniformly, we obtain the stochastic recursion

$$ C_{n}=n-1+C_{V_{1}}\quad\mbox{for $n\geq 1$, with boundary condition $C_{0}=0$,} \qquad (10) $$

where $V_{1}$ , the size of the left subtree, is a discrete uniform variable on $\{0,\ldots,n-1\}$ . From (10) we see that $C_{1}=0$ and $C_{2}=1$ a.s., and that non-trivial distributions arise for $n\geq 3$ .

Before proceeding to the proof of the theorem we describe how for all $n\geq 1$ we may write $C_{n}$ as a function $C(n;\textbf{U}_{1})$ with

$$ \textbf{U}_{k}=(U_{k},U_{k+1},\ldots)\quad\mbox{for $k\geq 1$,} $$

and $U_{1},U_{2},\ldots$ a sequence of i.i.d. uniform variables on $[0,1]$ . Consider the initial list of size $V_{0}=n$ as making up the left subtree at stage 0. At stage $k\geq 1$ , given a non-null left subtree from the previous stage of size $V_{k-1}$ , a new left subtree of size

$$ V_{k}=\lfloor V_{k-1}U_{k}\rfloor\quad\mbox{for $k\geq 1$} \qquad (11) $$

results by choosing a pivot uniformly from the current left subtree. In particular, the conditional distribution of $V_{k}$ given $V_{k-1}$ satisfies $V_{k}\sim{\cal U}\{0,\ldots,V_{k-1}-1\}$ . Rewriting (10) in this notation we have

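$$ C(n;\textbf{U}_{1})=n-1+C(\lfloor nU_{1}\rfloor;\textbf{U}_{2})\quad\mbox{for $n\geq 1$, with}\quad C(0;\textbf{U}_{k})=0\quad\mbox{for all $k\geq 1$.} \qquad (12) $$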

As the size of each non-null left subtree decrements by at least one at each iteration, the value of $C_{n}$ will only depend on an initial subsequence of $\textbf{U}_{1}$ of length at most $n$ .
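The representation of $C_{n}$ as a function of the uniform sequence, $C(n;\textbf{U}_{1})=n-1+C(\lfloor nU_{1}\rfloor;\textbf{U}_{2})$ with $C(0;\textbf{U}_{k})=0$, can be unrolled into a loop. The sketch below is illustrative only: the name `comparisons_m1` is an assumption, and the input is assumed to supply at least $n$ uniforms in $[0,1)$, which suffices because the subtree size drops by at least one per step.

```python
import math
import random


def comparisons_m1(n, uniforms):
    """C(n; U) for m = 1: add size - 1 per stage, then shrink the left subtree."""
    draws = iter(uniforms)
    total, size = 0, n
    while size > 0:
        total += size - 1
        size = math.floor(size * next(draws))   # V_k = floor(V_{k-1} * U_k)
    return total


# Usage: one random realisation of C_10.
rng = random.Random(0)
sample = comparisons_m1(10, (rng.random() for _ in range(10)))
```

Consistent with the text, the function returns 0 for $n\in\{0,1\}$ and 1 for $n=2$ whatever uniforms are supplied.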

We pause to prove a lemma that is needed in this and the following section.

Lemma 2.1

If for $c$ a non-negative number and $q$ a positive integer

$$ e_{n}\leq c+\frac{1}{n}\sum_{u=q}^{n-1}e_{u}\quad\mbox{for all $n\geq q$,} \qquad (13) $$

then

$$ e_{n}\leq c\log(en/q)\quad\mbox{for $n\geq q$.} \qquad (14) $$

Proof: As (13) holds for $n=q$ we see that $e_{q}\leq c$, verifying that the inequality in (14) holds at $q$. Assuming inequality (14) holds for $q\leq u\leq n-1$ for some $n\geq q+1$, we have

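$$ e_{n}\leq c+\frac{1}{n}\sum_{u=q}^{n-1}e_{u}\leq c+\frac{c}{n}\sum_{u=q}^{n-1}\log(eu/q)\leq c+\frac{c}{n}\int_{q}^{n}\log(eu/q)\,du=c\left(1+\frac{1}{n}\Big[u\log(eu/q)-u\Big]_{q}^{n}\right)=c\left(1+\frac{1}{n}\left[n\log(en/q)-n\right]\right)=c\log(en/q), $$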

completing the inductive step, and the proof. $\Box$
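Lemma 2.1 can be checked numerically on the extremal sequence, the one satisfying the hypothesis $e_{n}\leq c+\frac{1}{n}\sum_{u=q}^{n-1}e_{u}$ with equality. The sketch below builds that sequence and compares it with the claimed bound $c\log(en/q)$. The function names are illustrative assumptions, and floating point arithmetic is used, so comparisons should allow a small tolerance.

```python
import math


def extremal_sequence(c, q, n_max):
    """e_n = c + (1/n) * sum_{u=q}^{n-1} e_u for q <= n <= n_max."""
    e, running_sum = {}, 0.0
    for n in range(q, n_max + 1):
        e[n] = c + running_sum / n
        running_sum += e[n]
    return e


def lemma_bound(c, q, n):
    """The bound c * log(e * n / q) claimed by the lemma."""
    return c * math.log(math.e * n / q)


# Usage with the constants later used for Theorem 1.1 (c = 4, q = 2).
seq = extremal_sequence(4, 2, 500)
worst_slack = min(lemma_bound(4, 2, n) - seq[n] for n in seq)
```

The slack is zero at $n=q$, where both sides equal $c$, and positive afterwards.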

We now prove Theorem 1.1. In the proof, we use Lemmas 2.2 and 2.4, which appear with their proofs at the end of this section.

Proof of Theorem 1.1: Take $n\geq 1$ . With $V_{k}$ as in (11), by (12) the variable $W_{n}$ as given by (3) satisfies

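$$ W_{n}=\frac{1}{n}C(n;\textbf{U}_{1})-1=\frac{1}{n}\left(n-1+C(V_{1};\textbf{U}_{2})\right)-1=\frac{1}{n}\left(C(V_{1};\textbf{U}_{2})-1\right). $$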

We now construct a variable with the $W_{n}^{*}$ distribution by first constructing $W_{n}^{\prime}$ having the $W_{n}$ distribution. As $\textbf{U}_{1}$ and $\textbf{U}_{2}$ are equidistributed,

$$ W_{n}^{\prime}:=\frac{1}{n}C(n;\textbf{U}_{2})-1=_{d}\frac{1}{n}C(n;\textbf{U}_{1})-1=W_{n}, $$

and hence

$$ W_{n}^{*}:=U_{1}(W_{n}^{\prime}+1)=\frac{1}{n}U_{1}C(n;\textbf{U}_{2}) $$

has the ${\cal D}$ -bias distribution by (1). The difference

$$ W_{n}^{*}-W_{n}=\frac{1}{n}\left(U_{1}C(n;\textbf{U}_{2})-C(V_{1};\textbf{U}_{2})+1\right) $$

satisfies

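$$ nE|W_{n}^{*}-W_{n}|\leq e_{n}+1,\quad\mbox{where we set}\quad e_{k}=E\left|U_{1}C(k;\textbf{U}_{2})-C(\lfloor kU_{1}\rfloor;\textbf{U}_{2})\right|,\quad k\geq 0, $$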

hence consequence (9) of Theorem 1.5 with $\theta=1$ yields

$$ d_{1}(W_{n},D)\leq 2E|W_{n}^{*}-W_{n}|\leq\frac{2}{n}(e_{n}+1). \qquad (15) $$

We claim that

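$$ e_{n}=E\left|U_{1}C(n;\textbf{U}_{2})-C(\lfloor nU_{1}\rfloor;\textbf{U}_{2})\right|\leq E\left|U_{1}(n-1)-\lfloor nU_{1}\rfloor+1\right|+E\left|U_{1}C(\lfloor nU_{2}\rfloor;\textbf{U}_{3})-C(\lfloor\lfloor nU_{1}\rfloor U_{2}\rfloor;\textbf{U}_{3})\right|. $$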

When $\lfloor nU_{1}\rfloor\geq 1$ this inequality follows from using the basic recursion (12) on both terms forming the difference that defines $e_{n}$ , followed by applying the triangle inequality, and is easily verified to hold directly in the case $\lfloor nU_{1}\rfloor=0$ by applying (12) only on the first term of that difference, noting the second one in this case is zero. Now using that $|u(n-1)-\lfloor nu\rfloor+1|\leq 2$ for all $u\in[0,1]$ , we obtain

[TABLE]

For the final term, the inequality

[TABLE]

follows by applying Lemma 2.2, below, that shows that $|\lfloor U_{1}\lfloor nU_{2}\rfloor\rfloor-\lfloor U_{2}\lfloor nU_{1}\rfloor\rfloor|\leq 1$ a.s., and Lemma 2.4, also below, that shows that $E|C(p;\textbf{U}_{3})-C(p-1;\textbf{U}_{3})|\leq 2$ for all $p\geq 1$.

Expanding the expectation in $Ee_{\lfloor nU_{2}\rfloor}$ in (16), using the fact that $\lfloor nU_{2}\rfloor$ is uniformly distributed over $\{0,\ldots,n-1\}$ and that $e_{0}=e_{1}=0$ by virtue of $C_{0}=C_{1}=0$ , we obtain

[TABLE]

As $e_{1}=0$ inequality (15) shows that the claim of the theorem holds for $n=1$ . Applying Lemma 2.1 with $c=4$ and $q=2$ shows that $e_{n}\leq 4\log(en/2)$ for $n\geq 2$ , and substituting this bound into (15) and simplifying now completes the proof. $\Box$
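The final simplification can be spelled out as follows, using the bound $d_{1}(W_{n},D)\leq\frac{2}{n}(e_{n}+1)$ together with $e_{n}\leq 4\log(en/2)$ for $n\geq 2$, and $e_{1}=0$ for $n=1$. This is a worked restatement of the arithmetic, not an additional result.

```latex
% Case n >= 2: substitute e_n <= 4 log(en/2) and use log(en/2) = 1 + log(n/2).
d_{1}(W_{n},D) \;\leq\; \frac{2}{n}\bigl(4\log(en/2)+1\bigr)
              \;=\; \frac{2}{n}\bigl(4\log(n/2)+4+1\bigr)
              \;=\; \frac{8\log(n/2)+10}{n}.

% Case n = 1: e_1 = 0 gives d_1(W_1, D) <= 2, while the claimed bound equals
% 8 log(1/2) + 10 = 10 - 8 log 2 \approx 4.45, which is larger than 2.
d_{1}(W_{1},D) \;\leq\; 2 \;\leq\; 10-8\log 2 .
```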

We now prove Lemmas 2.2 and 2.4.

Lemma 2.2

For all $(u_{1},u_{2})\in[0,1)^{2}$ and $n\geq 0$ ,

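$$ \left|\lfloor u_{1}\lfloor nu_{2}\rfloor\rfloor-\lfloor u_{2}\lfloor nu_{1}\rfloor\rfloor\right|\leq 1. $$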

Proof: Consider the case $n\geq 1$ , as otherwise the claim is trivial. Let $s=\lfloor nu_{1}\rfloor$ and $t=\lfloor nu_{2}\rfloor$ , so that $(s,t)\in\{0,1,\ldots,n-1\}^{2}$ and

[TABLE]

Then

[TABLE]

Taking the difference,

[TABLE]

As the difference between $u_{1}\lfloor nu_{2}\rfloor$ and $u_{2}\lfloor nu_{1}\rfloor$ is less than 1, their integer parts can differ by at most 1. $\Box$
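The inequality of Lemma 2.2, in the form $|\lfloor u_{1}\lfloor nu_{2}\rfloor\rfloor-\lfloor u_{2}\lfloor nu_{1}\rfloor\rfloor|\leq 1$ quoted in the proof of Theorem 1.1, can be spot-checked by brute force. The sketch below scans a rational grid in $[0,1)^{2}$ with exact arithmetic so that floor values are not affected by rounding. The function name `max_floor_gap` and the grid construction are illustrative assumptions, and a grid check is of course no substitute for the proof.

```python
from fractions import Fraction
from math import floor


def max_floor_gap(n, grid):
    """Largest |floor(u1*floor(n*u2)) - floor(u2*floor(n*u1))| over a grid.

    The grid consists of the points i/grid, 0 <= i < grid, in each coordinate.
    """
    points = [Fraction(i, grid) for i in range(grid)]
    return max(
        abs(floor(u1 * floor(n * u2)) - floor(u2 * floor(n * u1)))
        for u1 in points
        for u2 in points
    )


# Usage: the gap never exceeds 1 on this grid for n = 10.
gap = max_floor_gap(10, 37)
```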

To prove Lemma 2.4, we will use the easily verified fact that

[TABLE]

and for $u\in[0,1]$ that

[TABLE]

We will also require the following inequality that can be shown directly using induction.

Lemma 2.3

If $c\geq 0,f_{1}=0$ and

[TABLE]

then $f_{p}\leq 2c$ for all $p\geq 1$ .

Lemma 2.4

For all $p\geq 1$

[TABLE]

Proof: As $f_{1}=0$ we need only consider $p\geq 2$ . In view of (17) we may write

[TABLE]

We claim that the conditional expectation in the first sum is 1. Indeed, for the given range of $U_{1}$ the first case of (20) yields $(\lfloor(p-1)U_{1}\rfloor,\lfloor pU_{1}\rfloor)=(k-1,k-1)$ , and now (12) implies that on this event

[TABLE]

For the second sum, the second case of (20) yields $(\lfloor(p-1)U_{1}\rfloor,\lfloor pU_{1}\rfloor)=(k-1,k)$ , and

[TABLE]

Hence,

[TABLE]

Invoking Lemma 2.3 with $c=1$ now completes the proof. $\Box$

2.2 Case of $m\geq 2$

In this section we prove Theorem 1.2 for the approximation of the distribution of the properly scaled value of the number $C_{n,m}$ of comparisons made by the Quickselect algorithm $Q_{m}$ to determine the $m^{th}$ smallest element of a list of $n$ distinct numbers in the case $m\geq 2$ .

As the $m^{th}$ smallest element of the list does not exist when $n<m$, no comparisons are required and we may set $C_{n,m}=0$ over this range. In the non-trivial case $n\geq m$, $Q_{m}$ begins as for $m=1$ at the first stage by selecting a uniformly chosen pivot, giving rise, through $n-1$ comparisons to the pivot, to a left subtree of size $V_{1}$, uniformly distributed over $\{0,\ldots,n-1\}$, and a right subtree of size $n-1-V_{1}$. If $V_{1}\geq m$ then the $m^{th}$ smallest element of the original list lies in the left subtree, and we may locate it by applying $Q_{m}$ to it. If $V_{1}=m-1$ then the pivot is the $m^{th}$ smallest element and the process stops. Otherwise $V_{1}<m-1$, and the $m^{th}$ smallest element is the $(m-V_{1}-1)^{st}$ smallest element in the right subtree, which we then locate by applying $Q_{m-V_{1}-1}$ to it. Hence, we obtain

[TABLE]

We now develop a simple bound on the expectation $E[C_{n,m}]$ .

Lemma 2.5

Let $C_{n,m}$ be the number of Quickselect comparisons for locating the $m^{th}$ smallest element of a list of $n$ distinct numbers. Then for all $m\geq 1$ ,

[TABLE]

Proof: Recall $h_{n}$ is the harmonic series $\sum_{1\leq k\leq n}1/k$ for $n\geq 1$ . The claim is trivial unless $n\geq m$ , and is also easily seen to be true for $m=1$ and $m=2$ using (5) and (6). Hence, we take $n\geq m\geq 3$ .

For such $n$ and $m$ , writing the difference between the two harmonic series below as a sum and separating out the last term for $j=m-2$ , we have

[TABLE]

the inequality holding since each ratio is bounded by 1. Hence, using the expression given for $E[C_{n,m}]$ in Theorem 1.4 and applying (22) to yield the first inequality below, we obtain the upper bound

[TABLE]

$\Box$

Note that the indicator on the first term on the right hand side of (21) may be dropped, due to the boundary condition there, on the line above. Now letting $C_{m}(n;\textbf{U}_{1})$ be defined by rewriting (21) as (12) was derived from (10), we obtain

[TABLE]

We next provide the following result that parallels Lemma 2.4 for the case $m=1$ .

Lemma 2.6

For all $m\geq 2$ and $p\geq 1$

[TABLE]

Proof: As $C_{m}(p;\textbf{U}_{1})=0$ for all $0\leq p\leq m-1$ we may take $p\geq m$ . By the basic recursion (23) we have

[TABLE]

Applying the triangle inequality and taking expectation yields

[TABLE]

For the first expectation in (24), by (20) we have

[TABLE]

Now applying Lemma 2.5 on the first term of the remainder $R$ , and using that $\lfloor pU_{1}\rfloor\sim{\cal U}\{0,\ldots,p-1\}$ , yields

[TABLE]

and replacing $p$ by $p-1$ we see that the same bound holds for the expectation of the final term of $R$ .

Substituting the bounds achieved into (24) we obtain

[TABLE]

As $f_{p}=0$ for $1\leq p\leq m-1$ inequality (25) holds for all $p\geq 2$ , and the conditions for invoking Lemma 2.3 with $c=1+8m$ are satisfied, yielding the desired conclusion. $\Box$

Proof of Theorem 1.2: Let $n\geq m$ . From (3) and (23), letting $V_{1}=\lfloor nU_{1}\rfloor$ ,

[TABLE]

We now construct a variable with the $W_{n}^{*}$ distribution. As $\textbf{U}_{1}$ and $\textbf{U}_{2}$ are equidistributed, $W_{n}^{\prime}$ given by the first equality in (26) when substituting $\textbf{U}_{2}$ in place of $\textbf{U}_{1}$ has law ${\cal L}(W_{n})$ . Hence, by (1) with $\theta=1$ , letting

[TABLE]

the pair $(W_{n},W_{n}^{*})$ is a coupling of a variable with the $W_{n}$ distribution to one with its Dickman ${\cal D}$ -bias distribution. Applying consequence (9) of Theorem 1.5, we obtain

[TABLE]

Letting

[TABLE]

in view of (26) and (27), and applying Lemma 2.5 to bound expectations of the form $E[C_{n,m}]$ and that $V_{1}\sim{\cal U}\{0,1,\ldots,n-1\}$ , we obtain

[TABLE]

To control $e_{n}$ , invoke the basic recursion (23) to write

[TABLE]

where

[TABLE]

and similarly,

[TABLE]

where

[TABLE]

and

[TABLE]

Taking the expectation of the absolute difference and using that $|u(n-1)-\lfloor nu\rfloor+1|\leq 2$ for all $u\in[0,1]$ , we obtain

[TABLE]

Lemmas 2.2 and 2.6 yield

[TABLE]

For the first remainder term $R_{1}$ , by Lemma 2.5, we have

[TABLE]

For $R_{2}$ , we condition on the event $\lfloor nU_{1}\rfloor=k$ for $1\leq k\leq n-1$ , then further on $\lfloor kU_{2}\rfloor=j$ for $0\leq j\leq k-1$ . We note the presence of $\lfloor nU_{1}\rfloor\geq m$ in the indicator restricts $k\geq m\geq 2$ in this second step, where the values of $j$ are all equally likely with probability $1/k$ . Applying Lemma 2.5 then yields

[TABLE]

As $R_{3}$ satisfies

[TABLE]

substituting the bounds (31)-(34) into (30) yields that, for all $n\geq m$ ,

[TABLE]

where the final equality follows by noting that $C(k;\textbf{U}_{1})=0$ for $k\leq m-1$ . Applying Lemma 2.1 yields that, for all $n\geq m$ ,

[TABLE]

and now from (29) we conclude

[TABLE]

Substitution into (28), and simplification, yields the claim. $\Box$

3 Proof of Theorem 1.5

Theorem 1.5 was originally proven using Stein’s method in [10], but [14] offered the following much simpler approach.

Proof: Let $U\sim{\cal U}[0,1]$ be independent of the pair $(W,D_{\theta})$ , which are constructed on the same space so as to achieve the infimum in (8). Then, as $D_{\theta}=_{d}D_{\theta}^{*}$ ,

[TABLE]

Now, by the triangle inequality,

[TABLE]

Rearranging the inequality yields the claimed bound. $\Box$
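One way the two steps of this argument can be written out is sketched below. It is a reconstruction from the surrounding text, under the stated assumptions that $(W,D_{\theta})$ is an optimal coupling and $U$ is independent of it; the author's own displays may differ in detail.

```latex
% Step 1: (U^{1/\theta}(W+1), U^{1/\theta}(D_\theta+1)) couples W^* with
% D_\theta^* =_d D_\theta, and E[U^{1/\theta}] = \int_0^1 u^{1/\theta}\,du = \theta/(\theta+1).
d_{1}(W^{*},D_{\theta})
  \;\leq\; E\bigl|U^{1/\theta}(W+1)-U^{1/\theta}(D_{\theta}+1)\bigr|
  \;=\; E\bigl[U^{1/\theta}\bigr]\,E|W-D_{\theta}|
  \;=\; \frac{\theta}{\theta+1}\,d_{1}(W,D_{\theta}).

% Step 2: triangle inequality, then Step 1.
d_{1}(W,D_{\theta})
  \;\leq\; d_{1}(W,W^{*})+d_{1}(W^{*},D_{\theta})
  \;\leq\; d_{1}(W,W^{*})+\frac{\theta}{\theta+1}\,d_{1}(W,D_{\theta}).

% Rearranging (d_1(W,D_\theta) is finite since W has finite mean):
\frac{1}{\theta+1}\,d_{1}(W,D_{\theta}) \;\leq\; d_{1}(W,W^{*}),
\qquad\text{that is,}\qquad
d_{1}(W,D_{\theta}) \;\leq\; (1+\theta)\,d_{1}(W,W^{*}).
```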

4 Proof of Theorem 1.3

We now apply Theorem 1.4 to prove Theorem 1.3.

Proof of Theorem 1.3. Since $f(x)=x$ is an element of ${\rm Lip}_{1}$ , expression (7) for the Wasserstein distance yields that

[TABLE]

applying (3) and that (see e.g. [12]) $E[D]=1$ .

Now, slightly rewriting the equality in (4) as

[TABLE]

for $m>2$ we have

[TABLE]

using $h_{n}>\log n$. Hence, the claim of Theorem 1.3 holds for $m>2$. We see the claim of Theorem 1.3 also holds for $m=1$ by using the form (5), which yields $|E[C_{n,m}]/n-2|=2h_{n}/n$, noting that in this case $(m+2)h_{m}-3=0$. $\Box$
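The mechanism of this proof, bounding $d_{1}(W_{n,m},D)$ from below by $|E[W_{n,m}]-E[D]|=|E[C_{n,m}]/n-2|$, can be checked numerically against the right-hand side $2(|m-2|\log n-|(m+2)h_{m}-3|)/n$ of Theorem 1.3. The sketch below uses the exact expression of Theorem 1.4 in floating point; the function names are illustrative assumptions.

```python
import math


def harmonic(n):
    """h_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def mean_gap(n, m):
    """|E[C_{n,m}]/n - 2|, with E[C_{n,m}] from the exact expression (n >= m)."""
    expected = 2 * (n + 3 + (n + 1) * harmonic(n)
                    - (m + 2) * harmonic(m)
                    - (n - m + 3) * harmonic(n - m + 1))
    return abs(expected / n - 2)


def lower_bound(n, m):
    """Right-hand side of Theorem 1.3."""
    return 2 * (abs(m - 2) * math.log(n)
                - abs((m + 2) * harmonic(m) - 3)) / n


# Usage: for m = 1 the gap is 2 * h_n / n, which exceeds 2 * log(n) / n.
gap_m1 = mean_gap(100, 1)
bound_m1 = lower_bound(100, 1)
```

For $m=2$ the right-hand side is negative for every $n$, so the comparison carries no information in that case, in line with the remark following Theorem 1.3.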

Acknowledgement The author thanks Ralph Neininger for his vast simplification of the previous proof of Theorem 1.5 in the preprint [10], as well as for the suggestion for obtaining the lower bounds as achieved in Theorem 1.3. The author also sincerely thanks two reviewers whose suggestions and observations were extremely valuable, which included pointing out that Theorem 1.4 is a known result due to Knuth, and a simplification of Lemma 2.5.

Bibliography

[1] Richard Arratia, Andrew Barbour, and Simon Tavaré, Logarithmic combinatorial structures: a probabilistic approach, EMS Monographs in Mathematics, European Mathematical Society (EMS), Zürich, 2003.
[2] Ehsan Azmoodeh, Benjamin Arras, Guillaume Poly, and Yvik Swan, Distances between probability distributions via characteristic functions and biasing, arXiv preprint:1605.06819 (2016).
[3] Andrew Barbour and Bruno Nietlispach, Approximation by the Dickman distribution and quasi-logarithmic combinatorial structures, Electron. J. Probab. 16 (2011), no. 29, 880–902.
[4] Chinmoy Bhattacharjee and Larry Goldstein, Dickman approximation in simulation, summations and perpetuities, to appear in: Bernoulli, arXiv preprint:1706.08192 (2018).
[5] Benjamin Dadoun and Ralph Neininger, A statistical view on exchanges in Quickselect, ANALCO 14—Meeting on Analytic Algorithmics and Combinatorics, SIAM, Philadelphia, PA, 2014, pp. 40–51.
[6] Luc Devroye and Omar Fawzi, Simulating the Dickman distribution, Statist. Probab. Lett. 80 (2010), no. 3-4, 242–247.
[7] Karl Dickman, On the frequency of numbers containing prime factors of a certain relative magnitude, Arkiv for matematik, astronomi och fysik 22 (1930), no. 10, 1–14.
[8] James Fill and Svante Janson, Quicksort asymptotics, J. Algorithms 44 (2002), no. 1, 4–28, Analysis of algorithms.
