Using Non-Linear Difference Equations to Study Quicksort Algorithms
Yukun Yao

TL;DR
This paper employs non-linear difference equations and symbolic computation to analyze the running times of various Quicksort algorithm variants, providing explicit formulas and experimental results.
Contribution
It introduces a novel approach using non-linear difference equations for detailed analysis of Quicksort variants, including explicit expectation and variance calculations.
Findings
Explicit expressions for expectations and variances of comparisons and swaps.
Monte Carlo experiments validate theoretical results.
Discussion of limiting distributions for some variants.
Abstract
Using non-linear difference equations, combined with symbolic computations, we make a detailed study of the running times of numerous variants of the celebrated Quicksort algorithms, where we consider the variants of single-pivot and multi-pivot Quicksort algorithms as discrete probability problems. With non-linear difference equations, recurrence relations and experimental mathematics techniques, explicit expressions for expectations, variances and even higher moments of their numbers of comparisons and swaps can be obtained. For some variants, Monte Carlo experiments are performed, the numerical results are demonstrated and the scaled limiting distribution is also discussed.
Click any figure to enlarge with its caption.
Figure 1Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Combinatorial Mathematics · Mathematical Dynamics and Fractals · Stochastic processes and statistical mechanics
Using Non-Linear Difference Equations to Study Quicksort Algorithms
Yukun Yao
Abstract
Using non-linear difference equations, combined with symbolic computations, we make a detailed study of the running times of numerous variants of the celebrated Quicksort algorithms, where we consider the variants of single-pivot and multi-pivot Quicksort algorithms as discrete probability problems. With non-linear difference equations, recurrence relations and experimental mathematics techniques, explicit expressions for expectations, variances and even higher moments of their numbers of comparisons and swaps can be obtained. For some variants, Monte Carlo experiments are performed, the numerical results are demonstrated and the scaled limiting distribution is also discussed.
Keywords: Experimental Mathematics, Quicksort Algorithms, Difference Equation, Recurrence Relation, Explicit Expression.
1 Introduction
A sorting algorithm is an algorithm that rearranges elements of a list in a certain order, the most frequently used orders being numerical order and lexicographical order. Sorting algorithms play a significant role in computer science since efficient sorting is important for optimizing the efficiency of other algorithms which require input data to be in sorted lists. In this paper, our focus is Quicksort.
Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961. It has been a commonly used algorithm for sorting since then and is still widely used in industry.
The main idea for Quicksort is that we choose a pivot randomly and then compare the other elements with the pivot, smaller elements being placed on the left side of the pivot and larger elements on the right side of the pivot. Then we recursively apply the same operation to the sublists obtained from the partition step. As for the specific implementations, there can be numerous variants, some of which are at least interesting from a theoretical perspective despite their rare use in the real world.
It is well-known that the worst-case performance of Quicksort is and the average performance is . However, we are also interested in the explicit closed-form expressions for the moments of Quicksort’s performance, i.e., running time, in terms of the number of comparisons and/or the number of swaps. In this paper, only lists or arrays containing distinct elements are considered.
The paper is organized as follows. In Section 2, we review related work on the number of comparisons of 1-pivot Quicksort, whose methodology is essential for further study. In Section 3, the numbers of swaps of several variants of 1-pivot Quicksort are considered. In Section 4, we extend our study to multi-pivot Quicksort. In Section 5, the technique to obtain more moments and the scaled limiting distribution are discussed. In the last section we discuss some potential improvements for Quicksort, summarize the main results of this paper and make final remarks on the methodology of experimental mathematics.
Accompanying Maple Package
This article is accompanied by Maple packages QuickSort.txt and Findrec.txt available from the front of this article
http://sites.math.rutgers.edu/~yao/QuickSort.html .
QuickSort.txt is the main package of this paper and all procedures mentioned in the paper are from this package unless noted otherwise. Findrec.txt is mainly used to find a recurrence relation, i.e., difference equation of moments from the empirical data.
2 Related Work
In the masterpiece of Shalosh B. Ekhad and Doron Zeilberger [3], they managed to find the explicit expressions for expectation, variance and higher moments of the number of comparisons of 1-pivot Quicksort with an experimental mathematics approach, which is also considered as some form of “machine learning.” Here we will review the results they discovered or rediscovered.
Let be the random variable “number of comparisons in Quicksort applied to lists of length ,” .
Theorem 2.1 ([12], p.8, end of section 1.3; [5], Eq. (2.14), p. 29, and other places)
[TABLE]
Here are the Harmonic numbers
[TABLE]
In following theorems, we introduce the notation
[TABLE]
Theorem 2.2 (Knuth, [11], answer to Ex. 8(b) in section 6.2.2))
[TABLE]
Its asymptotic expression is
[TABLE]
Theorem 2.3 (Zeilberger, [3]) The third moment about the mean of is
[TABLE]
It is asymptotic to
[TABLE]
[TABLE]
It follows that the limit of the scaled third moment (skewness) converges to
[TABLE]
Theorem 2.4 (Zeilberger, [3]) The fourth moment about the mean of is
[TABLE]
[TABLE]
[TABLE]
It is asymptotic to
[TABLE]
[TABLE]
[TABLE]
[TABLE]
It follows that the limit of the scaled fourth moment (kurtosis) converges to
[TABLE]
Results for higher moments, more precisely, up to the eighth moment, are also discovered and discussed by Shalosh B. Ekhad and Doron Zeilberger in [3].
Before this article, there are already human approaches to find the expectation and variance for the number of comparisons. Let . Since the pivot can be the -th smallest element in the list , we have the non-linear difference equation
[TABLE]
because the expected number of comparisons for the sublist before the pivot is and that for the sublist after the pivot is . From this difference equation, complicated human-generated manipulatorics is needed to rigorously derive the closed form. For the variance, the calculation is much more complicated. For higher moments, we doubt that human approach is realistic.
The experimental mathematics approach is more straightforward and more powerful. For the expectation, a list of data can be obtained through the difference equation and the initial condition. Then with an educated guess that is a polynomial of degree one in both and , i.e.,
[TABLE]
where are undetermined coefficients, we can solve for these coefficients by plugging sufficiently many and in this equation.
For higher moments, there is a similar difference equation for the probability generating function of . With the probability generating function, a list of data of any fixed moment can be obtained. Then with another appropriate educated guess of the form of the higher moments, the explicit expression follows.
In [3], it is already discussed that this experimental mathematics approach, which utilizes a difference equation to study the Quicksort algorithms, is actually rigorous by pointing out that this is a finite calculation and by referring to results in [16] and [17].
3 Number of Swaps of 1-Pivot Quicksort
The performance of Quicksort depends on the number of swaps and comparisons performed. In reality, a swap usually takes more computing resources than a comparison. The difficulty in studying the number of swaps is that the number of swaps depends on how we implement the Quicksort algorithm while the number of comparisons are the same despite the specific implementations.
Since only the number of comparisons is considered in [3], the Quicksort model in [3] is that one picks the pivot randomly, compares each non-pivot element with the pivot and then places them in one of the two new lists and where the former contains all elements smaller than the pivot and the latter contains those greater than the pivot. Under this model there is no swap, but a lot of memory is needed. For convenience, let’s call this model Variant Nulla.
In this section, we consider the random variable, the number of swaps , in different Quicksort variants. Some of them may not be efficient or widely used in industry; however, we treat them as an interesting problem and model in permutations and discrete mathematics. In the first subsection, we also demonstrate our experimental mathematics approaches step by step.
3.1 Variant I
The first variant is that we always choose the first (or equivalently, the last) element in the list of length as the pivot, then we compare the other elements with the pivot. We compare the second element with the pivot first. If it is greater than the pivot, it stays where it is, otherwise we put it before the pivot and all the other elements including the pivot shift one position to the right. Though this is somewhat different than the “traditional swap,” we define this operation as a swap. Generally, every time we find an element smaller than the pivot, we put it on the pivot’s current position and the indexes of the pivot and all elements on the right of pivot are incremented by one.
Hence, after comparisons and some number of swaps, the partition is achieved, i.e., all elements on the left of the pivot are smaller than the pivot and all elements on the right of the pivot are greater than the pivot. The difference between this variant and Variant Nulla is that this one does not need to create new lists so that it saves memory.
Let be the probability generating function for the number of swaps, i.e.,
[TABLE]
where for only finitely many integers , we have that is nonzero.
We have the difference equation
[TABLE]
with the initial condition because for any fixed , the probability that the pivot is the -th smallest is and there are exactly swaps when the pivot is the -th smallest.
The Maple procedure SwapPQs(n,t) in the package Quicksort.txt implements the recurrence of the probability generating function.
Recall that the -th moment is given in terms of the probability generating function
[TABLE]
The moment about the mean
[TABLE]
can be easily derived from the raw moments , using the binomial theorem and linearity of expectation. Another way to get the moment about the mean is by considering
[TABLE]
Recall that
[TABLE]
Our educated guess is that there exists a polynomial of variables such that
[TABLE]
With the Maple procedure QsMFn, we can easily obtain the following theorems by just entering QsMFn(SwapPQs, t, n, Hn, r) where represents the moment you are interested in. When , it returns the explicit expression for its mean rather than the trivial “first moment about the mean”.
Theorem 3.1.1 The expectation of the number of swaps of Quicksort for a list of length under Variant I is
[TABLE]
Theorem 3.1.2 The variance of is
[TABLE]
Theorem 3.1.3 The third moment about the mean of is
[TABLE]
Theorem 3.1.4 The fourth moment about the mean of is
[TABLE]
[TABLE]
[TABLE]
The explicit expressions for higher moments can be easily calculated automatically with the Maple procedure QsMFn and the interested readers are encouraged to find those formulas on their own.
3.2 Variant II
The second variant is similar to the first one. One tiny difference is that instead of choosing the first or last element as the pivot, the index of the pivot is chosen uniformly at random. For example, we choose the -th element, which is the -th smallest, as the pivot. Then we compare those non-pivot elements with the pivot. If , the first element will be compared with the pivot first. If it is smaller than the pivot, it stays there, otherwise it is moved to the end of the list and all other elements including the pivot shift one position to the left. After comparing all the left-side elements with the pivot, we look at those elements whose indexes are originally greater than . If they are greater than the pivot, no swap occurs; otherwise move them to the pivot’s current position and the indexes of the pivot and elements after the pivot but before the swapped element now are incremented by one.
In this case, the recurrence of the probability generating function is more complicated as a consequence of that the number of swaps given that and is known is still a random variable rather than a fixed number as the case in Variant I.
Let be the probability generating function for such a random variable. In fact, given a random permutation in the permutation group and that the -th element is , the number of swaps equals to the number of elements which are before and greater than or after and smaller than . Hence, if there are elements which are before and smaller than , then there are elements which are before and greater than and there are elements which are after and smaller than . So in this case the number of swaps is .
Then we need to determine the range of . Obviously it is at least 0. In total there are elements which are less than , at most of them occurring after , so . As for the upper bound, since there are only elements before , we have . Evidently, as well. Therefore the range of is .
As for the probability that there are exactly elements which are before and smaller than , it equals to
[TABLE]
Consequently, the probability generating function is
[TABLE]
which is implemented by the Maple procedure PerProb(n, k, i, t). For example, PerProb(9, 5, 5, t) returns
[TABLE]
We have the difference equation
[TABLE]
with the initial condition , which is implemented by the Maple procedure SwapPQ(n, t). The following theorems follow immediately.
Theorem 3.2.1 The expectation of the number of swaps of Quicksort for a list of length under Variant II is
[TABLE]
Theorem 3.2.2 The variance of is
[TABLE]
Theorem 3.2.3 The third moment about the mean of is
[TABLE]
Theorem 3.2.4 The fourth moment about the mean of is
[TABLE]
[TABLE]
[TABLE]
Higher moments can also be easily obtained by entering QsMFn(SwapPQ, t, n, Hn, r) where represents the -th moment you are interested in.
Comparing with Variant I where the index of the pivot is fixed, we find that these two variants have the same expected number of swaps. However, the variance and actually all even moments of the second variant are smaller. Considering that the average performance is already which is not far from the best scenario, it is favorable that a Quicksort algorithm has smaller variance. In conclusion, for this model, a randomly-chosen-index pivot can improve the performance of the algorithm.
3.3 Variant III
Next we’d like to study the most widely used in-place Quicksort. This algorithm is called Lomuto partition scheme, which is attributed to Nico Lomuto and popularized by Bentley in his book Programming Pearls and Cormen, et al. in their book Introduction to Algorithms. This scheme chooses a pivot that is typically the last element in the list. Two indexes, , the insertion index, and , the search index are maintained. The following is the pseudo code for this variant.
algorithm quicksort(A, s, t) **is
** if s < t **then
** p := partition(A, s, t)
quicksort(A, s, p - 1)
quicksort(A, p + 1, t)
algorithm partition(A, s, t) **is
** pivot := A[t]
i := s
for j := s to t - 1 **do
** if A[j] < pivot **then
** swap A[i] with A[j]
i := i + 1
swap A[i] with A[t]
return i
From the algorithm we can see that when the pivot is the -th smallest, there are elements smaller than the pivot. As a result, there are swaps in the if statement of the algorithm partition. Including the last swap outside the if statement, there are total swaps. We have the difference equation for its probability generating function
[TABLE]
with the initial condition .
Theorem 3.3.1 The expectation of the number of swaps of Quicksort for a list of length under Variant III
[TABLE]
Theorem 3.3.2 The variance of the number of swaps of Quicksort for a list of length under Variant III
[TABLE]
Note that for this variant, and some other ones in the remainder of the paper, to find out its explicit expression of moments, we may need to modify our educated guess to a “rational function” of and for some (see procedure QsMFnRat and QsMFnRatG). Moreover, sometimes when we solve the equations obtained by equalizing the guess with the empirical data, some initial terms should be disregarded since the increasing complexity of the definition of the Quicksort algorithms may lead to the “singularity” of the first several terms of moments. Usually, the higher the moment is, the more initial terms should be disregarded.
3.4 Variant IV
In Variant III, every time when A[j] < pivot, we swap A[i] with A[j]. However, it is a waste to swap them when . If we modify the algorithm such that a swap is performed only when the indexes , the expected cost will be reduced. Besides, if the pivot is actually the largest element, there is no need to swap A[i] with A[t] in the partition algorithm. To popularize Maple in mathematical and scientific research, we attach Maple code for the partition part here, the ParIP procedure in the package QuickSort.txt, in which Swap(S, i, j) is a subroutine to swap the -th element with the -th in a list .
ParIP:=proc(L) local pivot,i,j,n,S:
n:=nops(L):
pivot:=L[n]:
S:=L:
i:=1:
for j from 1 to n-1 do
if S[j]<=pivot then
if i<>j then
S:=Swap(S, i, j):
i:=i+1:
else
i:=i+1:
fi:
fi:
od:
if i<>n then
return Swap(S, i, n), i:
else
return S, i:
fi:
end:
Lemma 3.4.1 Let be the number of swaps needed in the first partition step in an in-place Quicksort without swapping the same index for a list of length when the pivot is the -th smallest element, then
[TABLE]
Proof.
When , it is obvious that for each search index , the condition S[j] <= pivot is satisfied, hence the insertion index is always equal to , which means there is no swap inside the loop. Since eventually , there is also no need to swap the pivot with itself. So the number of swaps is [math] in this case.
When , notice that the first time is smaller than is when we find the first element greater than the pivot. After that, will always be less than , which implies that for each element smaller than the pivot and the pivot itself, one swap is performed. ∎
The Maple procedure IPProb(n,k,t) takes inputs as defined above and a symbol , and outputs the probability generating function for the number of swaps in the first partition when the length of the list is and the last element, which is chosen as the pivot, is the -th smallest.
When , the probability that there are swaps is
[TABLE]
Therefore the probability generating function
[TABLE]
The difference equation for the probability generating function of the total number of swaps follows immediately:
[TABLE]
with the initial condition .
Theorem 3.4.2 The expectation of the number of swaps of Quicksort for a list of length under Variant IV
[TABLE]
Theorem 3.4.3 The variance of the number of swaps of Quicksort for a list of length under Variant IV
[TABLE]
Comparing these results with Theorem 3.1.1 and Theorem 3.2.1, it shows that Variant IV has better average performances, notwithstanding the “broader” definition of “swap” in the first two subsections. And of course, it is better than the in-place Quicksort which swaps the indexes even when they are the same. We fully believe that the additional cost to check whether the insertion and search indexes are the same is worthwhile.
3.5 Variant V
This variant might not be practical, but we find that it is interesting as a combinatorial model. As is well-known, if a middle-most element is chosen as a pivot, the Quicksort algorithm will have better performance than average in this case. Hence if additional information is available so that the probability distribution of chosen pivots is no longer a uniform distribution but something Gaussian-like, it is to our advantage.
Assume that the list is a permutation of and we are trying to sort it, pretending that we do not know the sorted list must be . Now the rule is that we choose the first and last number in the list, look at the numbers and choose the one which is closer to the middle. If the two numbers are equally close to the middle, then choose one at random.
Without loss of generality, we can always assume that the last element in the list is chosen as the pivot; otherwise we just need to run the algorithm in the last subsection “in reverse”, putting both the insertion and search indexes on the last element and letting larger elements be swapped to the left side of the pivot, etc. So the only difference with Variant IV is the probability distribution of , which is no longer for each
Considering symmetry, , so we only need to consider . When is even, let . Then . When is odd, let , then when and .
With this minor modification, the difference equation for follows.
[TABLE]
with the initial condition .
Though an explicit expression seems difficult in this case, we can still analyze the performance of the algorithm by evaluating its expected number of swaps. By exploiting the Maple procedure MomFn(f,t,m,N), which inputs a function name , a symbol , the order of the moment and the upper bound of the length of the list and outputs a list of -th moments for the Quicksort described by the probability generating function of lists of length , we find that Variant V has better average performance than Variant IV when is large enough. For example, MomFn(PQIP, t, 1, 20) returns
[TABLE]
[TABLE]
and MomFn(PQIPk, t, 1, 20) returns
[TABLE]
[TABLE]
We notice that for , Variant V consistently has better average performance. From this result we can conclude that it is worth choosing a pivot from two candidates since the gains of efficiency are far more than its cost.
Moreover, we can obtain a recurrence for the expected number of the random variable . The Findrec(f,n,N,MaxC) procedure in the Maple package Findrec.txt inputs a list of data , two symbols and , where is the discrete variable, and is the shift operator, and which is the maximum degree plus order. Findrec(MomFn(PQIPk, t, 1, 80), n, N, 11) returns
As previously mentioned, is the shift operator. Since this formula is too long, to see its detailed interpretation, please look at Theorem 4.2.1 as reference.
We can also look at more elements to choose the middle-most one as the pivot. In case that we do not want to store so much information, some other variants involving a “look twice” idea could be that if the first selected element is within some satisfactory interval, e.g., for a permutation of , then it is chosen as the pivot. Otherwise we choose a second element as the pivot without storing information about the first one. It is also likely to improve the algorithm with “multiple looks” to choose the pivot and the requirement to choose the current element as the pivot without continuing to look at the next one could vary and ideally the requirement should be more relaxed as the number of “looks” increases. We also refer the readers to [13] where P. Kirschenhofer, H. Prodinger and C. Martinez chose the median of a random sample of three elements as the pivot and obtained explicit expressions for this variant’s performance with methods of hypergeometric differential equations. In general, it is ideal to have a probability distribution of pivots where an element closer to the median is more likely to be chosen.
4 Explorations for Multi-Pivot Quicksort
With the same general idea as the 1-pivot Quicksort, it is natural to think about “what if we choose more pivots.” In -pivot Quicksort, indexes are chosen and the correspondent elements become pivots. The pivots need to be sorted and then the other elements will be partitioned into one of the sublists. Compared to 1-pivot Quicksort, multi-pivot Quicksort is much more complicated because we need to determine how to sort the pivots, how to allocate each non-pivot element to the sublist they belong to and how to sort a sublist containing less than elements. Therefore, there are a few multi-pivot Quicksort variants. We refer the reader to [9] for other versions of multi-pivot. When is large, it seems difficult to have an in-place version, so we mainly consider the random variable, the number of comparisons , in this section since a swap might be difficult to define in this case.
4.1 Dual-Pivot Quicksort
Let’s start from the simplest multi-pivot Quicksort: dual-pivot. The model for dual-pivot Quicksort is that two pivots and are randomly chosen. After one comparison, they are sorted, say . Non-pivot elements should be partitioned into one of the three sublists and . contains elements smaller than , contains elements between and while contains elements greater than . Each non-pivot element is compared with first. If it is smaller than , we are done. Otherwise it is compared with to determine whether it should go to or .
Given that the list contains distinct elements and the two pivots are the -th and -th smallest element , we need one comparison to sort the pivot and comparisons to distribute non-pivot elements to the sublists. Hence the total number of comparison is and the difference equation for the probability generating function of the total number of comparisons of dual-pivot Quicksort is
[TABLE]
with the initial condition and .
The above recurrence is implemented by the procedure PQc2. With the aforementioned Maple procedure QsMFn it is easy to get the following theorems.
Theorem 4.1.1 The expectation of the number of comparisons in dual-pivot Quicksort algorithms is
[TABLE]
Theorem 4.1.2 The variance of the number of comparisons in dual-pivot Quicksort algorithms is
[TABLE]
Theorem 4.1.3 The third moment about the mean of is
[TABLE]
Theorem 4.1.4 The fourth moment about the mean of is
[TABLE]
[TABLE]
[TABLE]
As usual, any higher moment can be easily obtained with a powerful silicon servant. Careful readers may notice that the above four theorems are exactly the same as the ones in Section 2. It is natural to ask whether they indeed have the same probability distribution. The answer is yes. In Section 4.1.2 of [8] the author gives a sketch of proof showing that 1-pivot and dual-pivot Quicksorts’s numbers of comparisons satisfy the same recurrence relation and initial condition. From an experimental mathematical point of view, a semi-rigorous proof obtained by checking sufficiently many special cases is good enough for us. For example, the first 10 probability generating function, for can be calculated in a nanosecond by entering [seq(PQc2(i, t), i = 1..10)] and we have
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
which are exactly the same as the probability generating function for the number of comparisons of 1-pivot Quicksort.
In conclusion, in terms of probability distribution of the number of comparisons, the dual-pivot Quicksort does not appear to be better than the ordinary 1-pivot Quicksort.
As for the random variable , the number of swaps, it depends on the specific implementation of the algorithm and the definition of a “swap.” As a toy model, we do an analogue of Section 3.1. Assume the list is a permutation of . The first and last elements are chosen as the pivot. Let’s say they are and . If then we swap them and still call the smaller pivot . For each element less than , we move it to the left of , and for each element greater than , we move it to the right of and call this kind of operations a swap.
The difference equation for the probability generating function of is
[TABLE]
with the initial conditions and
Theorem 4.1.5 The expectation of the number of swaps in the above dual-pivot Quicksort variant is
[TABLE]
Note that this result is better than those in Sections 3.1 and 3.2.
4.2 Three-Pivot Quicksort
As mentioned at the beginning of this section, to define a 3-pivot Quicksort, we need to define 1) how to sort the pivots, 2) how to partition the list and 3) how to sort a list or sublist containing less than 3 pivots. Since this paper is to study Quicksort, we choose 1-pivot Quicksort for 1) and 3). For 2), it seems that the binary search is a good option since for each non-pivot element exactly 2 comparisons with the pivots are needed.
The Maple procedure PQs(n,t) outputs the probability generating function for 1-pivot Quicksort of a list of length . Hence during the process of sorting the pivots and partitioning the list, the probability generating function of the number of comparisons is , which equals to
[TABLE]
So the difference equation for the probability generating function of the total number of comparisons for 3-pivot Quicksort of a list of length is
[TABLE]
with initial conditions and . This recurrence is implemented by the procedure PQd3.
The explicit expression seems to be difficult to obtain in this case, but numerical tests imply that 3-pivot Quicksort has better performances than dual-pivot, and of course 1-pivot since it is indeed equivalent to dual-pivot.
By exploiting the Maple procedure MomFn(f,t,m,N) again, we can compare the expectations of different Quicksort variants.
For instance, MomFn(PQc2, t, 1, 20) returns
[TABLE]
[TABLE]
and MomFn(PQd3, t, 1, 20) returns
[TABLE]
[TABLE]
We notice that for each fixed , 3-pivot Quicksort’s average performance is better than 2-pivot and 1-pivot. This numerical test is also possible for all previous Quicksort variants but seems unnecessary when the explicit expressions are easily accessible.
As in Section 3.5, a difference equation of the expected number of comparisons can be obtained. Findrec(MomFn(PQd3, t, 1, 40),n,N,8) returns
[TABLE]
[TABLE]
which leads to the following theorem.
Theorem 4.2.1 The expected number of comparisons of 3-pivot Quicksort for a list of length satisfies the following difference equation:
[TABLE]
[TABLE]
The difference equations for higher moments are also obtainable, but a long enough list of data is needed.
4.3 -Pivot Quicksort
More generally, -pivot Quicksort can be considered with the convention that 1) the pivots are sorted with 1-pivot Quicksort, 2) binary search is used to partition the list into sublists, 3) we use 1-pivot Quicksort to sort lists with length less than .
In the package QuickSort.txt the procedure PQck(n, t, k) outputs the probability generating function for the number of comparisons of -pivot Quicksort where each element is compared with pivots in a linearly increasing order. Obviously this is not efficient when is large. However, the problem for binary search is that when for some , it is hard to get an expression for the number of comparisons in the binary search step since the number highly depends on the specific implementation where some boundary cases may vary and the floor and ceiling functions will be involved, which leads to an increasing difficulty to find the explicit expressions for moments.
There is a procedure QsBC(L, k) which inputs a list of distinct numbers and an integer representing the number of pivots and outputs the sorted list and the total number of comparisons. For convenience of Monte Carlo experiments, we use MCQsBC(n, k, T) where is the length of the list, is the number of pivots and is the number of times we repeat the experiments. Because of the limit of computing resources, we only test for and
[TABLE]
[TABLE]
[TABLE]
[TABLE]
Our observation is that for large enough , the more pivots we use, the less comparisons are needed. However, when is too close to , the increase of pivots may lead to inefficiency.
5 Limiting Distribution
The main purpose of this paper is to find explicit expressions for the moments of the number of swaps or comparisons of some variants of Quicksort, to compare their performances and to explore more efficient Quicksort algorithms. However, it is also of interest to find more moments for large and calculate their floating number approximation of the scaled limiting distribution.
As mentioned in [3], if we are only interested in the first few moments, then it is wasteful to compute the full probability generating function Let and use the fact that
[TABLE]
where are the factorial moments. The straight moments , and the moments-about-the-mean, follow immediately.
As a case study, let’s use Variant IV from Section 3.4 as an example. The difference equation is
[TABLE]
where is as defined in Section 3.4.
Since only the first several factorial moments are considered, in each step truncation is performed and only the first several coefficients in is kept. With this method we can get more moments in a fixed time. The procedure TrunIP implements the truncated factorial generating function.
With the closed-form expressions for both the expectation, , and the variance , the scaled random variable is defined as follows.
[TABLE]
We are interested in the floating point approximations of the limiting distribution . Of course its expectation is [math] and its variance is .
For instance, if we’d like to know the moments up to order 10, TrunIP(100, z, 10) returns
[TABLE]
[TABLE]
[TABLE]
[TABLE]
For , the coefficient of times is the -th factorial moment. By
[TABLE]
where the curly braces denote Stirling numbers of the second kind, we can get the raw moments. And with the procedure MtoA a sequence of raw moments are transformed to moments about the mean. Divided by , the 3rd through 10th moments in floating point approximations are
[TABLE]
[TABLE]
The same technique can be applied to other variants of Quicksort in this paper and we leave this to interested readers.
6 Future Work and Final Remarks
In such a rich and active research area as Quicksort, there are still several things we could think about to improve the algorithms’ performances. Just to name a few, in 2-pivot Quicksort when we compare non-pivot elements with the pivots to determine which sublist they belong to, if the history is tracked, we might be able to use the history to determine which pivot to compare with first for the next element. The optimal strategy would vary with the additional information about the range of the numbers or the relative ranking of the two pivots among all the elements.
As for -pivot Quicksort, our naive approach only distinguishes two cases: whether the currently to-be-sorted list has length less than or not. If the length is less than , we use 1-pivot Quicksort; otherwise we still choose pivots. However, we might be able to improve the performance if the number of pivots varies according to the length of the to-be-sorted list or sublist. Let’s say there is a function , where is the length of the list. So we pick pivots at the beginning. After we obtain the sublists with length , for each one of them, we choose pivots. It would be interesting whether we can find an optimal in terms of its average performance. Additionally, when is large, it might make sense to use multi-pivot Quicksort to sort the pivots as well.
Of course, it is also interesting to study the explicit expressions of the numbers of swaps in multi-pivot Quicksort. But it appears to be dependent on the specific implementation of the algorithm so it is of significance to look for variants which save time and space complexity.
The main results of this paper are those explicit expressions of moments and difference equations for either the number of comparisons or the number of swaps of various Quicksort variants. Though all of their asymptotics are , the constant before this term varies a lot and some comparisons of these variants are also discussed. When there is difficulty getting the explicit expressions, numerical tests and Monte Carlo experiments are performed. We also have a demonstration on how to get more moments and find the numeric approximation of the scaled limiting distribution.
Nevertheless, more important than those results is the illustration of a methodology of experimental mathematics. From ansatzes and sufficient data we have an alternative way to obtain results that otherwise might be extremely difficult or even impossible to get via traditional human approaches to algorithm analysis.
Acknowledgment I would like to express my special thanks of gratitude to my advisor Doron Zeilberger for his mentoring and guidance. I am also grateful to Lun Zhang for interesting conversations and useful remarks. Many thanks are due to Vasileios Iliopoulos for his helpful clarifications and suggestions.
References
[1] Michael Cramer, A note concerning the limit distribution of the quicksort algorithm, Informatique Theériques et Applications, 30, 195-207, 1996.
[2] Marianne Durand, Asymptotic analysis of an optimized Quicksort algorithm, Inform. Proc. Lett., 85 (2): 73-77, 2003
[3] Shalosh B. Ekhad and Doron Zeilberger, A Detailed Analysis of Quicksort Running Time, The Personal Journal of Shalosh B. Ekhad and Doron Zeilberger, http://www.math.rutgers.edu/ zeilberg/pj.html . Direct url:
http://sites.math.rutgers.edu/ zeilberg/mamarim/mamarimhtml/qsort.html. Also in: https://arxiv.org/abs/1903.03708 .
[4] Shalosh B. Ekhad and Doron Zeilberger, Explicit Expressions for the Variance and Higher Moments of the Size of a Simultaneous Core Partition and its Limiting Distribution, The Personal Journal of Shalosh B. Ekhad and Doron Zeilberger, http://www.math.rutgers.edu/ zeilberg/pj.html . Direct url:
http://www.math.rutgers.edu/ zeilberg/mamarim/mamarimhtml/stcore.html. Also in: https://arxiv.org/abs/1508.07637 .
[5] Ronald L. Graham, Donald E. Knuth, and Oren Patashnik, “Concrete Mathematics”, Addison-Wesley, 1989.
[6] P. Hennequin, Combinatorial analysis of quicksort algorithm, RAIRO Theor. Inform. Appl., 23 (3): 317-333, 1989
[7] C.A.R. Hoare, Quicksort, The Computer Journal, 5:1, 10-15, 1962.
[8] Vasileios Iliopoulos, The Quicksort algorithm and related topics, https://arxiv.org/abs/1503.02504.
[9] Vasileios Iliopoulos, A note on multipivot Quicksort, Journal of Information and Optimization Sciences, 39:5, 1139-1147, 2018.
[10] Vasileios Iliopoulos and David B. Penman, Dual pivot Quicksort, Discrete Mathematics, Algorithms and Applications, Vol. 04, No. 03, 1250041, 2012.
[11] Donald E. Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, Addison-Wesley, 1973.
[12] Manuel Kauers and Peter Paule, The Concrete Tetrahedron, Springer, 2011.
[13] P. Kirschenhofer, H. Prodinger, C. Martinez, Analysis of Hoare’s FIND algorithm with Median-of-three partition, Random Structures and Algorithms, 10:1-2, 143-156, 1997.
[14] Charles Knessl and Wojciech Szpankowski, Quicksort algorithm again revisited, Discrete Mathematics and Theoretical Computer Science, 3, 43-64, 1999.
[15] Alois Panholzer, Analysis of multiple quickselect variants, Theor. Comput. Sci., 302 (1-3): 45-91, 2003
[16] Carsten Schneider, The Summation package Sigma, A Mathematica package available from https://www3.risc.jku.at/research/combinat/software/Sigma/index.php
[17] Carsten Schneider, Symbolic Summation Assists Combinatorics, Sem.Lothar.Combin. 56(2007),Article B56b (36 pages). https://www3.risc.jku.at/research/combinat/software/Sigma/pub/SLC06.pdf
[18] Wikipedia, , Quicksort, https://en.wikipedia.org/wiki/Quicksort
[19] Doron Zeilberger, A Holonomic Systems Approach To Special Functions, J. Computational and Applied Math 32 (1990), 321-368, http://www.math.rutgers.edu/~zeilberg/mamarim/mamarimPDF/holonomic.pdf.
[20] Doron Zeilberger, The Automatic Central Limit Theorems Generator (and Much More!), in: “Advances in Combinatorial Mathematics: Proceedings of the Waterloo Workshop in Computer Algebra 2008 in honor of Georgy P. Egorychev”, chapter 8, pp. 165-174, (I.Kotsireas, E.Zima, eds. Springer Verlag, 2009), http://www.math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/georgy.html.
[21] Doron Zeilberger, HISTABRUT: A Maple Package for Symbol-Crunching in Probability theory, the Personal Journal of Shalosh B. Ekhad and Doron Zeilberger, posted Aug. 25, 2010, http://www.math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/histabrut.html.
[22] Doron Zeilberger, Symbolic Moment Calculus I.: Foundations and Permutation Pattern Statistics, Annals of Combinatorics 8, 369-378, 2004 Available from: http://www.math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/smcI.html.
Yukun Yao, Department of Mathematics, Rutgers University (New Brunswick), Hill Center-Busch Campus, 110 Frelinghuysen Rd., Piscataway, NJ 08854-8019, USA. Email: yao at math dot rutgers dot edu .
