Non asymptotic distributional bounds for the Dickman Approximation of the running time of the Quickselect algorithm
Larry Goldstein

TL;DR
This paper establishes non-asymptotic bounds on the distributional approximation of the Quickselect algorithm's running time using the Dickman distribution, providing explicit convergence rates and insights into the approximation's accuracy.
Contribution
It introduces a new Wasserstein distance bound for the Dickman approximation and applies it to derive explicit, non-asymptotic convergence rates for Quickselect's running time distribution.
Findings
Derived explicit bounds for the distributional distance between Quickselect's running time and the Dickman distribution.
Proved the rate of convergence is optimal for certain parameters, matching known asymptotic results.
Provided exact expressions and lower bounds for the expected running time of Quickselect.
This work was partially supported by NSA grant H98230-15-1-0250.
MSC 2010 subject classifications: Primary 60F05, 68Q25.
Key words and phrases: sorting, complexity, distributional approximation.
(Department of Mathematics, University of Southern California)
Abstract
Given a non-negative random variable $W$ and $\theta>0$, let the generalized Dickman transformation map the distribution of $W$ to that of
$$W^* =_d U^{1/\theta}(W+1),$$
where $U \sim \mathcal{U}[0,1]$, a uniformly distributed variable on the unit interval, independent of $W$, and where $=_d$ denotes equality in distribution. It is well known that $W^*$ and $W$ are equal in distribution if and only if $W$ has the generalized Dickman distribution $\mathcal{D}_\theta$. We demonstrate that the Wasserstein distance $d_1$ between $W$, a non-negative random variable with finite mean, and $D_\theta$ having distribution $\mathcal{D}_\theta$ obeys the inequality
$$d_1(W,D_\theta) \le (1+\theta)\,d_1(W,W^*).$$
The specialization of this bound to the case $\theta=1$ and coupling constructions yield
$$d_1(W_{n,1},D) \le \frac{8\log (n/2)+10}{n} \quad \mbox{for all $n \ge 1$, where} \quad W_{n,m}=\frac{1}{n}C_{n,m}-1,$$
and $C_{n,m}$ is the number of comparisons made by the Quickselect algorithm to find the $m^{th}$ smallest element of a list of $n$ distinct numbers. A similar bound holds for $W_{n,m}$ for $m \ge 2$, and together these bounds recover and quantify the results of [12] that show distributional convergence of $W_{n,m}$ to the standard Dickman distribution $\mathcal{D}$ in the asymptotic regime $m=o(n)$. By comparison to an exact expression for the expected running time $E[C_{n,m}]$, lower bounds are provided that show the rate is not improvable for $m \ne 2$.
1 Introduction
For a given non-negative random variable $W$ and $\theta>0$, let the generalized Dickman transformation map the distribution of $W$ to that of
$$W^* =_d U^{1/\theta}(W+1), \tag{1}$$
where $U$ has the uniform distribution on the unit interval, and is independent of $W$, and where $=_d$ denotes equality in distribution. It is well known [6], [15] that the generalized Dickman distribution $\mathcal{D}_\theta$ is the unique fixed point of the transformation (1), that is,
$$W^* =_d W \quad \mbox{if and only if} \quad W \sim \mathcal{D}_\theta. \tag{2}$$
When (1) holds we will say that $W^*$ has the -bias distribution of $W$. In what follows, $D_\theta$ will denote a random variable with distribution $\mathcal{D}_\theta$. The case $\theta=1$ corresponds to the (standard) Dickman distribution, for which we may drop the subscript $\theta$.
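The fixed point property can be illustrated numerically. The sketch below reads the transformation (1) as $W \mapsto U^{1/\theta}(W+1)$ with a fresh independent uniform $U$ at each step, and iterates it starting from $W=0$. Taking expectations in the fixed point equation gives $E[D_\theta] = \frac{\theta}{1+\theta}(E[D_\theta]+1)$, that is, $E[D_\theta]=\theta$, so the sample mean of the iterates should be close to $\theta$. The function names, the number of iterations, and the sample size are illustrative assumptions, not taken from the paper.

```python
import random


def iterate_dickman_map(theta, steps, rng):
    """Apply w -> U**(1/theta) * (w + 1) `steps` times, starting from w = 0."""
    w = 0.0
    for _ in range(steps):
        u = rng.random()  # fresh uniform on [0, 1), independent of w
        w = u ** (1.0 / theta) * (w + 1.0)
    return w


def sample_mean(theta, steps=60, samples=20000, seed=0):
    """Monte Carlo mean of the iterated map; seed fixed for reproducibility."""
    rng = random.Random(seed)
    total = sum(iterate_dickman_map(theta, steps, rng) for _ in range(samples))
    return total / samples


if __name__ == "__main__":
    for theta in (1.0, 2.0):
        print(theta, sample_mean(theta))
```

Each application of the map shrinks the distance to the fixed point by the factor $E[U^{1/\theta}] = \theta/(1+\theta)$, so a few dozen iterations are ample for an illustration; this is a simulation heuristic only and plays no role in the proofs.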
The Dickman function first made its appearance in number theory [7] when counting the number of integers below a fixed threshold whose prime factors satisfy some given upper bound. Standardizing yields the density of the standard Dickman distribution, the canonical member of the family of generalized Dickman distributions, which also arise in the study of component counts of logarithmic combinatorial structures such as permutations and partitions [1], and more generally for the quasi-logarithmic class considered in [3]. See also the recent work [17], [2] and [4] in this area, which details some connections to probabilistic number theory.
Members from the generalized Dickman family have subsequently been noted to arise in a variety of other contexts, in particular for the sum of edge lengths of vertices connected to the origin in minimal directed spanning trees in [15], and for weighted sums of independent random variables in [16], [2] and [4]. Simulation of the Dickman distribution has been considered in [6].
Here we study the error incurred when using the standard Dickman distribution to approximate that of the (properly normalized) number of comparisons made by the Quickselect algorithm of Hoare [11] for locating the $m^{th}$ smallest element of a list of $n$ distinct numbers. One may visualize how Quickselect works in terms of a tree structure. First, a "pivot" is chosen uniformly from the given list. The list is then divided into those numbers on the list that are strictly smaller, making up the left subtree, and those that are strictly larger, making up the right. If the left subtree is of size $m-1$ then the pivot is the desired $m^{th}$ smallest element, and the procedure terminates. Otherwise, the process continues recursively on the left subtree if it is of size $m$ or larger, and else on the right subtree.
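The procedure just described can be sketched in a few lines. The following is a minimal illustration rather than the author's code: it assumes distinct values and a rank $m$ between 1 and the list length, and it charges one comparison for every non-pivot element at each stage. The function name and the explicit random generator argument are illustrative choices.

```python
import random


def quickselect_comparisons(values, m, rng):
    """Return (m-th smallest value, number of comparisons to the pivots).

    Assumes the values are distinct and 1 <= m <= len(values).
    """
    items = list(values)
    comparisons = 0
    while True:
        pivot = rng.choice(items)              # pivot chosen uniformly
        comparisons += len(items) - 1          # pivot vs. every other element
        left = [x for x in items if x < pivot]
        right = [x for x in items if x > pivot]
        if len(left) == m - 1:                 # pivot is the m-th smallest
            return pivot, comparisons
        if len(left) >= m:                     # target lies in the left subtree
            items = left
        else:                                  # target lies in the right subtree
            m -= len(left) + 1
            items = right


if __name__ == "__main__":
    rng = random.Random(0)
    print(quickselect_comparisons([7, 3, 9, 1, 5], 2, rng))
```

The returned value is deterministic, while the comparison count depends on the random pivots; it is this count, suitably normalized, whose distribution is approximated by the Dickman law.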
Letting
$$W_{n,m}=\frac{1}{n}C_{n,m}-1, \tag{3}$$
where $C_{n,m}$ is the number of comparisons made by Quickselect, the work of [12] showed that $W_{n,m}$ converges in distribution to the Dickman $\mathcal{D}$ when $m=o(n)$. We note that in the case $m=1$ Quickselect simplifies in that at each step of the recursion the procedure either stops or continues on the left subtree. As this case is simpler than for $m \ge 2$ we deal with it separately.
The following two theorems quantify and recover the results of [12] by providing non-asymptotic bounds in the Wasserstein distance between $W_{n,m}$ and $D$ that converge to zero in the asymptotic regime. As the $m^{th}$ smallest number of a list of $n$ distinct numbers only exists when $1 \le m \le n$, we need only consider this range of parameters in what follows.
Theorem 1.1
Let $C_{n,1}$ be the number of comparisons made by Quickselect to find the smallest of a list of $n$ distinct numbers, and let $W_{n,1}$ be given by (3). Then for all $n \ge 1$
$$d_1(W_{n,1},D) \le \frac{8\log (n/2)+10}{n}.$$
Theorem 1.2
Let and the number of comparisons made by Quickselect to find the smallest element of a list of distinct numbers, and let be given by (3). Then for all
[TABLE]
That the bounds in Theorems 1.1 and 1.2 are tight in the order for is a consequence of the following result; in the following, we let for .
Theorem 1.3
For all ,
[TABLE]
We note that in the case the lower bound simplifies to . That our method, where we focus only on the expectation to achieve our lower bound, does not succeed in the case is explained by the lack of the term on the right hand side of (6). Theorem 1.3 is shown using the following exact expression for the expected running time of Quickselect; see also Section 6 of [9].
Theorem 1.4 (Knuth [13])
Let be the number of comparisons made by Quickselect to locate the smallest of distinct numbers. Then for all
[TABLE]
In particular,
[TABLE]
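An exact expectation of this kind can be checked against the pivot recursion directly. The sketch below computes $E[C_{n,m}]$ by conditioning on the size of the left subtree, using exact rational arithmetic, and compares it with the closed form usually attributed to Knuth, $E[C_{n,m}] = 2\big(n+3+(n+1)H_n-(m+2)H_m-(n+3-m)H_{n+1-m}\big)$ with $H_k$ the $k$-th harmonic number. That closed form is written here from the standard literature and is an assumption about how the display above reads; the function names are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def expected_comparisons(n, m):
    """Exact E[C_{n,m}] from the pivot recursion, for 1 <= m <= n."""
    if n == 0 or not (1 <= m <= n):
        return Fraction(0)
    total = Fraction(0)
    for left in range(n):  # size of the left subtree, uniform on 0..n-1
        if left >= m:      # continue in the left subtree
            total += expected_comparisons(left, m)
        elif left < m - 1:  # continue in the right subtree
            total += expected_comparisons(n - 1 - left, m - 1 - left)
        # left == m - 1: the pivot is the target and the procedure stops
    return (n - 1) + total / n


def harmonic(k):
    """k-th harmonic number as an exact fraction, with H_0 = 0."""
    return sum((Fraction(1, j) for j in range(1, k + 1)), Fraction(0))


def knuth_closed_form(n, m):
    """Assumed closed form for E[C_{n,m}] (standard statement of Knuth's result)."""
    return 2 * (n + 3 + (n + 1) * harmonic(n)
                - (m + 2) * harmonic(m)
                - (n + 3 - m) * harmonic(n + 1 - m))


if __name__ == "__main__":
    print(expected_comparisons(10, 1), knuth_closed_form(10, 1))
```

Working with `Fraction` avoids any floating point tolerance: agreement between the recursion and the closed form is checked as an exact identity for each small $(n, m)$.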
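Theorems 1.1 and 1.2 are derived by applying Theorem 1.5 that quantifies the if direction of the fixed point property (2) in the Wasserstein, or $d_1$, metric between two random variables $X$ and $Y$, given by
$$d_1(X,Y)=\sup_{h \in \mathrm{Lip}_1}|E[h(X)]-E[h(Y)]|, \quad \mbox{where} \quad \mathrm{Lip}_1=\{h: |h(x)-h(y)| \le |x-y|\}. \tag{7}$$
On the left hand side of (7) we have chosen to write $d_1(X,Y)$, rather than the technically correct expression $d_1(\mathcal{L}(X),\mathcal{L}(Y))$, only for notational convenience.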
Theorem 1.5
Let $W$ be a non-negative random variable with finite mean, let $\theta>0$, and let the law of $W^*$ be given by (1). Then
$$d_1(W,D_\theta) \le (1+\theta)\,d_1(W,W^*).$$
As the Wasserstein distance also satisfies
$$d_1(X,Y)=\inf E|X-Y|, \tag{8}$$
where the infimum is over all couplings $(X,Y)$ having the given marginals, and is achieved here (see [18], for instance), Theorem 1.5 implies that
$$d_1(W,D_\theta) \le (1+\theta)\,E|W^*-W| \tag{9}$$
for any non-negative random variable $W$ with finite mean, and $W^*$ defined on a common space having the -bias distribution of $W$.
In Section 2 we detail the workings of the Quickselect algorithm and prove Theorems 1.1 and 1.2 by applying Theorem 1.5, which is proved in Section 3. The proof of Theorem 1.3 appears in Section 4.
In related work, [8] considers the Quicksort method, which produces a fully sorted list, and [5] obtains distributional bounds for the running time of a variation of Quickselect to a non-Dickman approximand; compare its characterization in (1.4) there to (1) here.
2 The Quickselect Method and the Proofs of Theorems 1.1 and 1.2
In this section we apply Theorem 1.5 to obtain the bounds in Theorems 1.1 and 1.2 on the error of the Dickman approximation for the distribution of in (3), the properly normalized running time of the Quickselect algorithm for finding the smallest element of a list of distinct numbers. When the value of is clear from context, we will write for .
2.1 Quickselect: the case $m=1$
In this section we prove Theorem 1.1 for the distribution of the number of comparisons that Quickselect requires to locate the smallest element of a list of distinct numbers. Clearly, a list of size zero requires no comparisons, hence . For , the procedure requires the comparisons of the pivot to every other element at the first stage, followed by the cost of processing the left subtree, which may be empty. Since the pivot is chosen uniformly, we obtain the stochastic recursion
[TABLE]
where , the size of the left subtree, is a discrete uniform variable on . From (10) we see that and a.s., and that non-trivial distributions arise for .
Before proceeding to the proof of the theorem we describe how for all we may write as a function with
[TABLE]
and a sequence of i.i.d. uniform variables on . Consider the initial list of size as making up the left subtree at stage 0. At stage , given a non-null left subtree from the previous stage of size , a new left subtree of size
[TABLE]
results by choosing a pivot uniformly from the current left subtree. In particular, the conditional distribution of given satisfies . Rewriting (10) in this notation we have
[TABLE]
As the size of each non-null left subtree decrements by at least one at each iteration, the value of will only depend on an initial subsequence of of length at most .
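The representation of the comparison count as a function of a uniform sequence can be sketched as follows for the case of locating the smallest element. The sketch assumes that a left subtree of size $k$ is replaced by one of size $\lfloor Uk \rfloor$ for a fresh uniform $U$, which is uniform on $\{0,\ldots,k-1\}$; whether this is exactly the map used in the displays above is an assumption, as is every name in the code. It also reports how many uniforms were consumed, which in this sketch never exceeds the initial list size.

```python
import random


def comparisons_from_uniforms(n, uniforms):
    """Comparisons to find the smallest of n items, driven by given uniforms.

    Returns (total comparisons, number of uniforms consumed).
    """
    size, total, used = n, 0, 0
    stream = iter(uniforms)
    while size >= 1:
        total += size - 1                    # pivot vs. the rest of the subtree
        u = next(stream)
        # floor(u * size) is uniform on {0, ..., size - 1} for u uniform on [0, 1);
        # the min() only guards against floating point edge cases.
        size = min(int(u * size), size - 1)
        used += 1
    return total, used


if __name__ == "__main__":
    rng = random.Random(1)
    print(comparisons_from_uniforms(10, (rng.random() for _ in range(10))))
```

Driving two lists of different sizes with the same uniform sequence produces the kind of coupling used in the proofs, since the resulting subtree sizes stay close to each other at every stage.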
We pause to prove a lemma that is needed in this and the following section.
Lemma 2.1
If for a non-negative number and a positive integer
[TABLE]
then
[TABLE]
Proof: As (13) holds for we see that , verifying that the inequality in (14) holds at . Assuming inequality (13) holds for for some we have
[TABLE]
completing the inductive step, and the proof.
We now prove Theorem 1.1. In the proof, we use Lemmas 2.2 and 2.4, which appear with their proofs at the end of this section.
Proof of Theorem 1.1: Take . With as in (11), by (12) the variable as given by (3) satisfies
[TABLE]
We now construct a variable with the distribution by first constructing having the distribution. As and are equidistributed,
[TABLE]
and hence
[TABLE]
has the -bias distribution by (1). The difference
[TABLE]
satisfies
[TABLE]
hence consequence (9) of Theorem 1.5 with yields
[TABLE]
We claim that
[TABLE]
When this inequality follows from using the basic recursion (12) on both terms forming the difference that defines , followed by applying the triangle inequality, and is easily verified to hold directly in the case by applying (12) only on the first term of that difference, noting the second one in this case is zero. Now using that for all , we obtain
[TABLE]
For the final term, the inequality
[TABLE]
follows by applying Lemma 2.2, below, which shows that a.s., and Lemma 2.4, also below, which shows that for all .
Expanding the expectation in in (16), using the fact that is uniformly distributed over and that by virtue of , we obtain
[TABLE]
As inequality (15) shows that the claim of the theorem holds for . Applying Lemma 2.1 with and shows that for , and substituting this bound into (15) and simplifying now completes the proof.
We now prove Lemmas 2.2 and 2.4.
Lemma 2.2
For all and ,
[TABLE]
Proof: Consider the case , as otherwise the claim is trivial. Let and , so that and
[TABLE]
Then
[TABLE]
Taking the difference,
[TABLE]
As the difference between and is less than 1, their integer parts can differ by at most 1.
To prove Lemma 2.4, we will use the easily verified fact that
[TABLE]
and for that
[TABLE]
We will also require the following inequality that can be shown directly using induction.
Lemma 2.3
If and
[TABLE]
then for all .
Lemma 2.4
For all
[TABLE]
Proof: As we need only consider . In view of (17) we may write
[TABLE]
We claim that the conditional expectation in the first sum is 1. Indeed, for the given range of the first case of (20) yields , and now (12) implies that on this event
[TABLE]
For the second sum, the second case of (20) yields , and
[TABLE]
Hence,
[TABLE]
Invoking Lemma 2.3 with now completes the proof.
2.2 Case of $m \ge 2$
In this section we prove Theorem 1.2 for the approximation of the distribution of the properly scaled value of the number of comparisons made by the Quickselect algorithm to determine the smallest element of a list of distinct numbers in the case .
As the smallest element of the list does not exist when , no comparisons are required and we may set over this range. In the non-trivial case , begins as for at the first stage by selecting a uniformly chosen pivot, giving rise, through comparisons to the pivot, to a left subtree of size , uniformly distributed over , and a right subtree of size . If then the smallest element of the original list lies in the left subtree, and we may locate it by applying to it. If then the pivot is the smallest element and the process stops. Otherwise , and the smallest element is the smallest element in the right subtree, which we then locate by applying to it. Hence, we obtain
[TABLE]
We now develop a simple bound on the expectation .
Lemma 2.5
Let be the number of Quickselect comparisons for locating the smallest element of a list of distinct numbers. Then for all ,
[TABLE]
Proof: Recall is the harmonic series for . The claim is trivial unless , and is also easily seen to be true for and using (5) and (6). Hence, we take .
For such and , writing the difference between the two harmonic series below as a sum and separating out the last term for , we have
[TABLE]
the inequality holding since each ratio is bounded by 1. Hence, using the expression given for in Theorem 1.4 and applying (22) to yield the first inequality below, we obtain the upper bound
[TABLE]
Note that the indicator on the first term on the right hand side of (21) may be dropped, due to the boundary condition there, on the line above. Now letting be defined by rewriting (21) as (12) was derived from (10), we obtain
[TABLE]
We next provide the following result that parallels Lemma 2.4 for the case .
Lemma 2.6
For all and
[TABLE]
Proof: As for all we may take . By the basic recursion (23) we have
[TABLE]
Applying the triangle inequality and taking expectation yields
[TABLE]
For the first expectation in (24), by (20) we have
[TABLE]
Now applying Lemma 2.5 on the first term of the remainder , and using that , yields
[TABLE]
and replacing by we see that the same bound holds for the expectation of the final term of .
Substituting the bounds achieved into (24) we obtain
[TABLE]
As for inequality (25) holds for all , and the conditions for invoking Lemma 2.3 with are satisfied, yielding the desired conclusion.
Proof of Theorem 1.2: Let . From (3) and (23), letting ,
[TABLE]
We now construct a variable with the distribution. As and are equidistributed, given by the first equality in (26) when substituting in place of has law . Hence, by (1) with , letting
[TABLE]
the pair is a coupling of a variable with the distribution to one with its Dickman -bias distribution. Applying consequence (9) of Theorem 1.5, we obtain
[TABLE]
Letting
[TABLE]
in view of (26) and (27), and applying Lemma 2.5 to bound expectations of the form and that , we obtain
[TABLE]
To control , invoke the basic recursion (23) to write
[TABLE]
where
[TABLE]
and similarly,
[TABLE]
where
[TABLE]
and
[TABLE]
Taking the expectation of the absolute difference and using that for all , we obtain
[TABLE]
[TABLE]
For the first remainder term , by Lemma 2.5, we have
[TABLE]
For , we condition on the event for , then further on for . We note the presence of in the indicator restricts in this second step, where the values of are all equally likely with probability . Applying Lemma 2.5 then yields
[TABLE]
As satisfies
[TABLE]
substituting the bounds (31)-(34) into (30) yields that, for all ,
[TABLE]
where the final equality follows by noting that for . Applying Lemma 2.1 yields that, for all ,
[TABLE]
and now from (29) we conclude
[TABLE]
Substitution into (28), and simplification, yields the claim.
3 Proof of Theorem 1.5
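Theorem 1.5 was originally proven using Stein's method in [10], but [14] offered the following much simpler approach.
Proof: Let $U \sim \mathcal{U}[0,1]$ be independent of the pair $(W,D_\theta)$, which are constructed on the same space so as to achieve the infimum in (8). Then, as $W^* =_d U^{1/\theta}(W+1)$ by (1) and $D_\theta =_d U^{1/\theta}(D_\theta+1)$ by (2),
$$d_1(W^*,D_\theta) \le E\left|U^{1/\theta}(W+1)-U^{1/\theta}(D_\theta+1)\right| = E[U^{1/\theta}]\,E|W-D_\theta| = \frac{\theta}{\theta+1}\,d_1(W,D_\theta).$$
Now, by the triangle inequality,
$$d_1(W,D_\theta) \le d_1(W,W^*)+d_1(W^*,D_\theta) \le d_1(W,W^*)+\frac{\theta}{\theta+1}\,d_1(W,D_\theta).$$
Rearranging the inequality yields the claimed bound.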
4 Proof of Theorem 1.3
We now apply Theorem 1.4 to prove Theorem 1.3.
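Proof of Theorem 1.3. Since $h(x)=x$ is an element of $\mathrm{Lip}_1$, the class of 1-Lipschitz functions, expression (7) for the Wasserstein distance yields that
$$d_1(W_{n,m},D) \ge |E[W_{n,m}]-E[D]| = \left|\frac{E[C_{n,m}]}{n}-2\right|,$$
applying (3) and that (see e.g. [12]) $E[D]=1$.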
Now, slightly rewriting the equality in (4) as
[TABLE]
for we have
[TABLE]
using . Hence, the claim of Theorem 1.3 holds for . We see the claim of Theorem 1.3 also holds for by using the form (5), which yields , noting that in this case .
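The comparison of orders can be made concrete for the smallest element. The sketch below assumes the normalization $W_{n,1}=C_{n,1}/n-1$ for (3) and $E[D]=1$, so that the gap in means $|E[C_{n,1}]/n-2|$ is a lower bound on $d_1(W_{n,1},D)$. It computes $E[C_{n,1}]$ from the recursion "$n-1$ comparisons plus the cost of a left subtree of uniform size on $\{0,\ldots,n-1\}$", and sets the result beside the upper bound $(8\log(n/2)+10)/n$ quoted in the abstract. Function names are illustrative.

```python
import math


def expected_comparisons_m1(n_max):
    """E[C_{n,1}] for n = 0..n_max via e_n = (n - 1) + (e_0 + ... + e_{n-1}) / n."""
    expected = [0.0]
    running_sum = 0.0  # e_0 + ... + e_{n-1}
    for n in range(1, n_max + 1):
        expected.append((n - 1) + running_sum / n)
        running_sum += expected[n]
    return expected


def mean_gap_lower_bound(n, expected):
    """|E[W_{n,1}] - E[D]| under the assumed normalization W = C / n - 1."""
    return abs(expected[n] / n - 2.0)


def theorem_1_1_upper_bound(n):
    """Upper bound (8 log(n/2) + 10) / n as quoted in the abstract."""
    return (8.0 * math.log(n / 2.0) + 10.0) / n


if __name__ == "__main__":
    e = expected_comparisons_m1(1000)
    for n in (10, 100, 1000):
        print(n, mean_gap_lower_bound(n, e), theorem_1_1_upper_bound(n))
```

Both quantities decay like a constant multiple of $\log n / n$ in this computation, which is the sense in which the upper bound cannot be improved in order for the smallest element.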
Acknowledgement The author thanks Ralph Neininger for his vast simplification of the previous proof of Theorem 1.5 in the preprint [10], as well as for the suggestion for obtaining the lower bounds as achieved in Theorem 1.3. The author also sincerely thanks two reviewers whose suggestions and observations were extremely valuable, which included pointing out that Theorem 1.4 is a known result due to Knuth, and a simplification of Lemma 2.5.
References
- [1] Richard Arratia, Andrew Barbour, and Simon Tavaré, Logarithmic combinatorial structures: a probabilistic approach, EMS Monographs in Mathematics, European Mathematical Society (EMS), Zürich, 2003.
- [2] Ehsan Azmoodeh, Benjamin Arras, Guillaume Poly, and Yvik Swan, Distances between probability distributions via characteristic functions and biasing, arXiv preprint:1605.06819 (2016).
- [3] Andrew Barbour and Bruno Nietlispach, Approximation by the Dickman distribution and quasi-logarithmic combinatorial structures, Electron. J. Probab. 16 (2011), no. 29, 880–902.
- [4] Chinmoy Bhattacharjee and Larry Goldstein, Dickman approximation in simulation, summations and perpetuities, to appear in: Bernoulli, arXiv preprint:1706.08192 (2018).
- [5] Benjamin Dadoun and Ralph Neininger, A statistical view on exchanges in Quickselect, ANALCO 14, Meeting on Analytic Algorithmics and Combinatorics, SIAM, Philadelphia, PA, 2014, pp. 40–51.
- [6] Luc Devroye and Omar Fawzi, Simulating the Dickman distribution, Statist. Probab. Lett. 80 (2010), no. 3-4, 242–247.
- [7] Karl Dickman, On the frequency of numbers containing prime factors of a certain relative magnitude, Arkiv for matematik, astronomi och fysik 22 (1930), no. 10, 1–14.
- [8] James Fill and Svante Janson, Quicksort asymptotics, J. Algorithms 44 (2002), no. 1, 4–28, Analysis of algorithms.
