Generalized Rank Dirichlet Distributions
David Itkin

TL;DR
This paper introduces the Generalized Rank Dirichlet (GRD) distributions, a new family of distributions on the ordered simplex that generalize the Dirichlet distribution, allowing for negative parameters and providing explicit moments and simulation methods.
Contribution
The paper defines GRD distributions on the ordered simplex, derives explicit moments, and develops exact and approximate simulation algorithms, expanding modeling capabilities for ranked data.
Findings
Explicit moments for GRD distributions across dimensions.
Series representations and simulation algorithms for the distributions.
Application potential in financial modeling and ranked statistics.
David Itkin (Department of Mathematics, Imperial College London, [email protected])
Abstract
We study a new parametric family of distributions on the ordered simplex $\nabla^{d-1} = \{y \in \mathbb{R}^d : y_1 \ge \dots \ge y_d \ge 0,\ \sum_{k=1}^d y_k = 1\}$, which we call Generalized Rank Dirichlet (GRD) distributions. Their density is proportional to $\prod_{k=1}^d y_k^{a_k-1}$ for a parameter $a = (a_1,\dots,a_d) \in \mathbb{R}^d$ satisfying $a_k + a_{k+1} + \dots + a_d > 0$ for $k = 2,\dots,d$. The density is similar to the Dirichlet distribution, but is defined on $\nabla^{d-1}$, leading to different properties. In particular, certain components $a_k$ can be negative. Random variables $Y = (Y_1,\dots,Y_d)$ with GRD distributions have previously been used to model capital distribution in financial markets and more generally can be used to model ranked order statistics of weight vectors. We obtain for any dimension $d$ explicit expressions for moments of order $M \in \mathbb{N}$ for the $Y_k$'s and moments of all orders for the log gaps $Z_k = \log Y_{k-1} - \log Y_k$ when $a_1 + \dots + a_d = -M$. Additionally, we propose an algorithm to exactly simulate random variates in this case. In the general case $a_1 + \dots + a_d \in \mathbb{R}$ we obtain series representations for these quantities and provide an approximate simulation algorithm.
Keywords:
Generalized Rank Dirichlet Distribution, Dirichlet Distribution, Poisson–Dirichlet Distribution, Exponential Distribution, Ordered Simplex, Ranked Weights.
MSC 2020 Classification:
Primary 60E05; Secondary 62G30
1 Introduction
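For an integer $d \ge 2$ we study a parametric family of distributions defined on the ordered simplex
$$\nabla^{d-1} = \Big\{ y \in \mathbb{R}^d : y_1 \ge y_2 \ge \dots \ge y_d \ge 0,\ \sum_{k=1}^d y_k = 1 \Big\},$$
whose density is proportional to
$$\prod_{k=1}^d y_k^{a_k - 1} \tag{1}$$
for a parameter $a = (a_1,\dots,a_d) \in \mathbb{R}^d$. It was shown in [6] (and reproduced below in Proposition 1) that this density induces a probability measure on $\nabla^{d-1}$, when appropriately normalized, if and only if
$$\bar a_k := \sum_{j=k}^d a_j > 0 \quad \text{for } k = 2,\dots,d. \tag{2}$$
Notably, condition (2) allows for certain $a_k$'s to be negative as long as the tail sum remains positive. In fact, even parameters satisfying $a_k < 0$ for $k = 1,\dots,d-1$ can be compatible with condition (2) (as long as $a_d$ is sufficiently positive).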
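Condition (2) is easy to check numerically. The following minimal Python sketch (using NumPy; the helper names `tail_sums` and `satisfies_condition_2` are illustrative and not from the paper) computes the tail sums $a_k + \dots + a_d$ and tests positivity for $k \ge 2$, using a parameter with two negative components as an example.

```python
import numpy as np

def tail_sums(a):
    """Return the vector of tail sums (a_k + ... + a_d) for k = 1, ..., d."""
    a = np.asarray(a, dtype=float)
    # Reverse, accumulate, reverse back: entry k-1 holds a_k + ... + a_d.
    return np.cumsum(a[::-1])[::-1]

def satisfies_condition_2(a):
    """True if every tail sum with k >= 2 is strictly positive.

    The first tail sum (the total a_1 + ... + a_d) is deliberately not
    checked, since a_1 is unconstrained in condition (2).
    """
    return bool(np.all(tail_sums(a)[1:] > 0))

# Hypothetical example parameter with negative components a_1 and a_3.
a_example = [-3.0, 2.0, -0.5, 1.0]
print(tail_sums(a_example))
print(satisfies_condition_2(a_example))
```

In this example the total is negative while all tail sums from the second entry onward are positive, so the parameter is admissible.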
In the special case $a_1 = a_2 = \dots = a_d > 0$, if $X \sim \mathrm{Dirichlet}(a)$ then the ranked vector $(X_{(1)},\dots,X_{(d)})$ of decreasing order statistics has density proportional to (1). In the case that the components of $a$ are not all the same this relationship is no longer true. However, since the functional form of (1) is the same as for the Dirichlet density (just defined on the ordered simplex rather than the standard simplex) we call the induced probability distribution the Generalized Rank Dirichlet distribution with parameter $a$, or GRD($a$) for short.
The GRD distribution can be used to model the distribution of ranked weight vectors that sum to one even for a general parameter. Indeed, if $X$ is a random (unordered) vector of nonnegative weights that sum to one with density proportional to $\prod_{k=1}^d x_{(k)}^{a_k-1}$ then the decreasing order statistics $(X_{(1)},\dots,X_{(d)})$ follow a GRD($a$) distribution.
To the best of the author’s knowledge the general form of the GRD($a$) distribution under the condition (2) first appeared as the invariant density of a certain stochastic process, called a rank Jacobi process, in [6]. Previously, the special case with $\bar a_1 = 0$ had appeared in [1, 8, 4, 3], where it arose as the invariant measure of a class of processes known as Atlas or first-order models. In particular, in [1], a connection to independent exponential random variables via the log gaps (see equation (3) below) was established. The analysis in this paper heavily exploits this relationship to exponential random variables in the case $\bar a_1 = 0$ to study GRD($a$) distributions for more general parameters $a$.
Arguably, the most well-studied distribution that models ranked weights is the Poisson–Dirichlet (PD) distribution introduced by Kingman in [7]. Indeed, it has found applications in a large number of fields including population genetics, number theory, physics, finance and statistics (see [9, 2] for detailed accounts of the PD distribution). However, it is defined on the infinite dimensional Kingman simplex and as such is an infinite-dimensional distribution. In the author’s PhD thesis [5], it was shown that, under appropriate assumptions on the parameter vector, the GRD distribution converges as $d \to \infty$ to a distribution on the Kingman simplex which is absolutely continuous with respect to a PD distribution with an explicitly given density. As such, the GRD family can be viewed as a finite dimensional relative of the PD distribution.
Remarkably, even in the most basic case $d = 2$, the GRD distribution does not in general seem to be a standard probability distribution with a previously recorded name. When $d = 2$ we can write $Y_1 = 1 - Y_2$ and reduce to a one-dimensional random variable $Y_2$, which has density proportional to
$$y^{a_2 - 1} (1 - y)^{a_1 - 1}, \qquad y \in (0, 1/2].$$
When $a_1 > 0$ this coincides with a truncated Beta distribution, but the case $a_1 \le 0$ does not seem to have an established name.
Nevertheless, this distribution has remarkable structural properties. In Section 2 we formally define the GRD distribution. Under the condition $\bar a_1 = 0$ the aforementioned relationship to independent exponential distributions is explored in Section 3, which we use to obtain negative moments of all orders for the largest weight $Y_1$. In Section 4 we then obtain a change of measure identity which establishes a relationship between GRD distributions with different parameters. In Section 5 we explore the case $\bar a_1 = -M$ for some positive integer $M$. In this case the change of measure formula can be leveraged to obtain explicit expressions for the positive moments of the $Y_k$'s up to order $M$, which are derived in Section 5.1. In particular, when $M = 1$, the moment formula is invertible with respect to the parameter vector, allowing for explicit first moment matching. Additionally, it is shown in Section 5.3 that the log gaps
$$Z_k = \log Y_{k-1} - \log Y_k, \qquad k = 2,\dots,d, \tag{3}$$
can be represented as a mixture of exponential random variables when $\bar a_1 = -M$. This leads us to explicit formulas for the moment generating function and moments of all orders for the log gaps. Using the log gaps as an intermediary, in Section 5.4, we derive an algorithm to simulate exactly from the GRD($a$) distribution in the case $\bar a_1 = -M$. The general case when $\bar a_1$ is not assumed to be a negative integer is studied in Section 6. In this case we obtain a series representation for moments of the log gaps and leverage this to propose an approximate simulation algorithm to generate GRD($a$) random variates.
Notation.
The tail sum notation of , as in (2), is in force throughout the paper. We write for the standard basis vectors in . We denote by the natural numbers (starting from one) and . For an integer we define . By convention, empty sums are taken to be zero, while empty products are taken to be one. Since is a -dimensional subset of , all integrals over should be understood as the pushforward of Lebesgue measure on under the map .
2 The GRD Distribution
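Given $a \in \mathbb{R}^d$ we consider the normalizing constant $\int_{\nabla^{d-1}} \prod_{k=1}^d y_k^{a_k-1}\,dy$. Then we have the following result already established in [6]. The proof is short and insightful so we reproduce it here.
Proposition 1 (Finite normalizing constant).
The integral $\int_{\nabla^{d-1}} \prod_{k=1}^d y_k^{a_k-1}\,dy$ is finite if and only if $\bar a_k > 0$ for $k = 2,\dots,d$.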
- Proof.
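First note that the size or sign of $a_1$ does not affect integrability of $\prod_{k=1}^d y_k^{a_k-1}$ since $y_1 \in [1/d, 1]$ on $\nabla^{d-1}$. Hence we assume without loss of generality that $\bar a_1 = 0$. Then we rewrite the integral as
$$\int_{\nabla^{d-1}} \prod_{k=1}^d y_k^{a_k-1}\,dy = \int_{\nabla^{d-1}} \prod_{k=2}^d \Big(\frac{y_k}{y_{k-1}}\Big)^{\bar a_k} \prod_{k=1}^d \frac{1}{y_k}\,dy.$$
Next consider the change of variables $u_k = y_k / y_{k-1}$ for $k = 2,\dots,d$. This transformation maps the ordered simplex onto $[0,1]^{d-1}$ (up to a null set) and its Jacobian is determined by $\prod_{k=1}^d y_k^{-1}\,dy = \prod_{k=2}^d u_k^{-1}\,du$. Thus we obtain
$$\int_{\nabla^{d-1}} \prod_{k=1}^d y_k^{a_k-1}\,dy = \int_{[0,1]^{d-1}} \prod_{k=2}^d u_k^{\bar a_k - 1}\,du = \prod_{k=2}^d \int_0^1 u^{\bar a_k - 1}\,du.$$
This expression is finite if and only if $\bar a_k > 0$ for every $k = 2,\dots,d$, completing the proof. ∎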
This leads us to the standing assumption mentioned in the introduction.
Assumption 2.
The parameter vector $a \in \mathbb{R}^d$ satisfies $\bar a_k > 0$ for $k = 2,\dots,d$.
We can now formally define the GRD distribution.
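Definition 3 (Generalized Rank Dirichlet (GRD) Distribution).
For a parameter $a \in \mathbb{R}^d$ satisfying Assumption 2 the probability measure
$$\mathbb{P}_a(dy) = \frac{\prod_{k=1}^d y_k^{a_k-1}}{\int_{\nabla^{d-1}} \prod_{k=1}^d x_k^{a_k-1}\,dx}\,dy, \qquad y \in \nabla^{d-1},$$
is called a Generalized Rank Dirichlet (GRD) distribution with parameter $a$. We will write $Y$ for a random variable with law $\mathbb{P}_a$ and denote by $\mathbb{E}_a$ expectation under $\mathbb{P}_a$.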
3 The case $\bar a_1 = 0$
An important special case of interest is when $\bar a_1 = 0$. In this case a similar calculation as in the proof of Proposition 1 shows that the log gaps given by (3) are distributed as independent exponentially distributed random variables whenever $Y \sim \mathrm{GRD}(a)$, and consequently, the weight ratios $Y_{k-1}/Y_k$ follow a Pareto distribution. Moreover, the normalizing constant is explicitly computable in this case. To the best of the author’s knowledge the Pareto property was first observed in [3] and the relationship to independent exponential random variables was explored in [1]. We collect these results in the following proposition.
Proposition 4 (Section 4 in [1]).
When $\bar a_1 = 0$ we have that the normalizing constant equals $\prod_{k=2}^d \bar a_k^{-1}$. Additionally the log gaps $Z_2,\dots,Z_d$ are independent and satisfy $Z_k \sim \mathrm{Exp}(\bar a_k)$, while the ratios $Y_{k-1}/Y_k$ are independent and follow a Pareto distribution with scale $1$ and shape $\bar a_k$ for $k = 2,\dots,d$.
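The exponential representation gives a direct way to draw samples in the zero-sum case. The following is a minimal sketch in Python with NumPy, assuming the log gaps are $Z_k = \log Y_{k-1} - \log Y_k$, independent and exponential with rate equal to the tail sum $\bar a_k = a_k + \dots + a_d$ for $k = 2,\dots,d$. The function name `sample_grd_zero_sum` and its arguments are illustrative and not taken from the paper.

```python
import numpy as np

def sample_grd_zero_sum(abar_tail, size, rng):
    """Draw `size` ranked weight vectors when the parameters sum to zero.

    abar_tail : tail sums (abar_2, ..., abar_d), all strictly positive.
    Returns an array of shape (size, d) with non-increasing rows summing to 1.
    """
    abar_tail = np.asarray(abar_tail, dtype=float)
    if np.any(abar_tail <= 0):
        raise ValueError("all tail sums abar_2, ..., abar_d must be positive")
    # Independent log gaps Z_k ~ Exp(rate = abar_k); NumPy uses scale = 1/rate.
    z = rng.exponential(scale=1.0 / abar_tail, size=(size, abar_tail.size))
    # log(Y_k / Y_1) = -(Z_2 + ... + Z_k); the first column is log(Y_1 / Y_1) = 0.
    log_ratio = np.concatenate(
        [np.zeros((size, 1)), -np.cumsum(z, axis=1)], axis=1
    )
    ratios = np.exp(log_ratio)
    # Normalizing the ratios Y_k / Y_1 recovers weights that sum to one.
    return ratios / ratios.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
y = sample_grd_zero_sum([2.5, 0.5, 1.0], size=5, rng=rng)
print(y)
```

Working with ratios to the largest weight avoids solving for $Y_1$ separately: the constraint that the weights sum to one fixes the overall scale.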
These facts can be leveraged to compute certain expected ratios and negative moments of .
Theorem 5.
Let $a$ satisfying Assumption 2 be given and suppose that $\bar a_1 = 0$.
- (i)
(Moments of ratios) Let and such that be given. Then
[TABLE]
- (ii)
(Negative moments of $Y_1$) For any $M \in \mathbb{N}$,
[TABLE]
- Proof.
First we assume that . In this case note that the expectation on the left hand side of (4) is given by . Since we obtain
[TABLE]
by Proposition 4, which proves (i) in this case.
To prove (i) in the general case we use the multinomial formula to obtain
[TABLE]
In the last equality we used (6), which is applicable since . Finally (5) follows by taking in (4). ∎
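As a small worked instance of how such moments arise, assume (as in the zero-sum case) that the ratios $Y_k/Y_{k-1}$, $k = 2,\dots,d$, are independent with $\mathbb{E}[Y_k/Y_{k-1}] = \bar a_k/(\bar a_k + 1)$, which is the mean of the reciprocal of a Pareto variable with scale $1$ and shape $\bar a_k$. Then $\mathbb{E}[Y_k/Y_1] = \prod_{j=2}^k \bar a_j/(\bar a_j + 1)$ and, because the weights sum to one, $\mathbb{E}[1/Y_1] = \sum_{k=1}^d \mathbb{E}[Y_k/Y_1]$. The sketch below computes these two quantities. It covers only this first-order case, not the general formulas (4) and (5), and the function names are hypothetical.

```python
import numpy as np

def expected_ratio_to_top(abar_tail):
    """E[Y_k / Y_1] for k = 1, ..., d in the zero-sum case.

    abar_tail : tail sums (abar_2, ..., abar_d), all strictly positive.
    """
    abar_tail = np.asarray(abar_tail, dtype=float)
    # Each independent ratio Y_k / Y_{k-1} has mean abar_k / (abar_k + 1).
    factors = abar_tail / (abar_tail + 1.0)
    # Y_k / Y_1 is the product of the first k - 1 ratios; k = 1 gives 1.
    return np.concatenate([[1.0], np.cumprod(factors)])

def expected_inverse_top(abar_tail):
    """E[1 / Y_1], using 1 / Y_1 = sum_k Y_k / Y_1."""
    return float(expected_ratio_to_top(abar_tail).sum())

# d = 2 with abar_2 = 1: Y_2 / Y_1 is uniform on (0, 1), so E[1 / Y_1] = 1.5.
print(expected_inverse_top([1.0]))
```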
4 A change of measure formula
We now derive a change of measure identity, which holds for any GRD distribution. This identity is the workhorse for the computations to come.
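Theorem 6 (Change of measure).
Fix $a, b \in \mathbb{R}^d$ satisfying Assumption 2. Let $f$ be a function that is integrable under $\mathbb{P}_a$. Then
$$\mathbb{E}_a[f(Y)] = \frac{\mathbb{E}_b\big[f(Y) \prod_{k=1}^d Y_k^{a_k - b_k}\big]}{\mathbb{E}_b\big[\prod_{k=1}^d Y_k^{a_k - b_k}\big]}. \tag{7}$$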
- Proof.
We see that
[TABLE]
where in the intermediate equality we multiplied and divided by ∎
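As we saw in Section 3, the case when the sum of the parameters is zero is particularly tractable. Thus a canonical choice for the vector $b$ in the change of measure formula is $b = a - \bar a_1 e_1$, in which case $\bar b_1 = 0$. Under this choice (7) becomes
$$\mathbb{E}_a[f(Y)] = \frac{\mathbb{E}_{a - \bar a_1 e_1}\big[f(Y)\, Y_1^{\bar a_1}\big]}{\mathbb{E}_{a - \bar a_1 e_1}\big[Y_1^{\bar a_1}\big]}. \tag{8}$$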
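The zero-sum choice lends itself to a simple numerical illustration. Since the unnormalized GRD($a$) density equals $y_1^{\bar a_1}$ times the unnormalized density with $a_1$ replaced by $a_1 - \bar a_1$ (where $\bar a_1 = a_1 + \dots + a_d$), expectations under GRD($a$) can be estimated by self-normalized importance sampling from the zero-sum distribution. The sketch below assumes the exponential log-gap representation of Section 3 for the zero-sum case; all function names are illustrative and not from the paper.

```python
import numpy as np

def sample_grd_zero_sum(abar_tail, size, rng):
    """Zero-sum sampler: independent log gaps with rates abar_2, ..., abar_d."""
    abar_tail = np.asarray(abar_tail, dtype=float)
    z = rng.exponential(scale=1.0 / abar_tail, size=(size, abar_tail.size))
    log_ratio = np.concatenate(
        [np.zeros((size, 1)), -np.cumsum(z, axis=1)], axis=1
    )
    ratios = np.exp(log_ratio)
    return ratios / ratios.sum(axis=1, keepdims=True)

def grd_expectation(f, a, n, rng):
    """Estimate E_a[f(Y)] by reweighting zero-sum samples with Y_1 ** abar_1.

    f must map an (n, d) array of weight vectors to an array of length n.
    """
    a = np.asarray(a, dtype=float)
    abar = np.cumsum(a[::-1])[::-1]          # tail sums abar_1, ..., abar_d
    y = sample_grd_zero_sum(abar[1:], n, rng)
    w = y[:, 0] ** abar[0]                   # bounded, since Y_1 lies in [1/d, 1]
    return float(np.sum(w * f(y)) / np.sum(w))

rng = np.random.default_rng(0)
# d = 2, a = (-2, 1): the mean of the smaller weight works out to 1/3.
est = grd_expectation(lambda y: y[:, 1], a=[-2.0, 1.0], n=200_000, rng=rng)
print(est)
```

Because $Y_1 \in [1/d, 1]$, the importance weights are bounded above and below, so the estimator is well behaved for any real $\bar a_1$, although its variance grows with $|\bar a_1| \log d$.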
5 The case $\bar a_1 = -M$
5.1 Moments of the $Y_k$'s
Remarkably, the identities for the negative moments of $Y_1$ when $\bar a_1 = 0$ can be used to derive positive moments, up to order $M$, for a GRD($a$) distribution when $\bar a_1 = -M$. This is the content of the next theorem.
Theorem 7 (Moment formulas for $Y$).
Suppose that $a$ satisfies Assumption 2 and that $\bar a_1 = -M$ for some $M \in \mathbb{N}$. Then for any $m \in \mathbb{N}_0^d$ with $m_1 + \dots + m_d \le M$ we have that
[TABLE]
- Proof.
This follows directly by taking $f(y) = \prod_{k=1}^d y_k^{m_k}$ in (8) and invoking Theorem 5 to compute the right hand side of (8). ∎
When $M = 1$ this formula takes a particularly simple form
[TABLE]
In particular, this formula is invertible, allowing explicit first moment matching that can be used to calibrate the parameters to data.
Corollary 8 (First moment matching).
Let satisfying be given. Define via
[TABLE]
Then satisfies Assumption 2, and for .
- Proof.
This is readily verified by applying (9) to this choice of . ∎
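A sketch of first moment matching in Python follows. It does not reproduce formula (9) or the definition in Corollary 8 verbatim. Instead it uses relations derived here under the assumption that the parameters sum to $-1$: combining the change of measure with the zero-sum ratio means $\bar a_k/(\bar a_k+1)$ gives $\mathbb{E}[Y_k]/\mathbb{E}[Y_{k-1}] = \bar a_k/(\bar a_k + 1)$, hence $\bar a_k = \mu_k/(\mu_{k-1} - \mu_k)$ for target means $\mu_1 > \dots > \mu_d > 0$ summing to one. Here $\bar a_k$ denotes the tail sum $a_k + \dots + a_d$, and the function names are hypothetical.

```python
import numpy as np

def match_first_moments(mu):
    """Return a parameter vector a (summing to -1) whose mean weights equal mu.

    mu must be strictly decreasing, strictly positive and sum to one.
    """
    mu = np.asarray(mu, dtype=float)
    if np.any(mu <= 0) or np.any(np.diff(mu) >= 0) or not np.isclose(mu.sum(), 1.0):
        raise ValueError("mu must be strictly decreasing, positive, and sum to 1")
    # Invert mu_k / mu_{k-1} = abar_k / (abar_k + 1) for k = 2, ..., d.
    abar_tail = mu[1:] / (mu[:-1] - mu[1:])
    abar = np.concatenate([[-1.0], abar_tail])
    # Recover a_k = abar_k - abar_{k+1}, with abar_{d+1} = 0.
    return abar - np.concatenate([abar[1:], [0.0]])

def first_moments(a):
    """Mean weights E[Y_k] for a parameter vector whose entries sum to -1."""
    a = np.asarray(a, dtype=float)
    abar = np.cumsum(a[::-1])[::-1]
    if not np.isclose(abar[0], -1.0):
        raise ValueError("this formula assumes the parameters sum to -1")
    ratios = np.concatenate([[1.0], np.cumprod(abar[1:] / (abar[1:] + 1.0))])
    return ratios / ratios.sum()

a_fit = match_first_moments([0.5, 0.3, 0.2])
print(a_fit)
print(first_moments(a_fit))
```

The round trip through `first_moments` recovers the target means, which is a convenient sanity check when calibrating to observed average ranked weights.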
5.2 An improved change of measure formula
In the case that for some , the denominator of (8) is explicitly computable courtesy of Theorem 5. By writing we can also expand the numerator to obtain that
[TABLE]
where the intermediate equality followed from Theorem 6 (with taken to be and taken to be in the notation of the theorem), while the final equality followed from Theorem 5(i) since . This leads us to the following improved change of measure formula.
Theorem 9 (Change of measure v2).
Let $a$ satisfying Assumption 2 be given and suppose that $\bar a_1 = -M$ for some $M \in \mathbb{N}$. Then we have that
[TABLE]
for any GRD($a$)-integrable function $f$.
Since the weights appearing in (11) are positive and sum to one, Theorem 9 establishes that the GRD($a$) distribution can be explicitly represented as a mixture of GRD distributions with parameters that sum to zero. This relationship can be leveraged to obtain certain moment formulas for the weights and log gaps, which are explored in the sections below. Additionally, marginal distributions for the weights under the GRD($a$) distribution can be studied with this change of measure identity as well, though we do not pursue this direction in detail here.
5.3 The log gaps as a mixture of exponential random variables
The change of measure formula of Theorem 9 is particularly insightful when we consider the log gaps $Z_k$ for $k = 2,\dots,d$. Indeed, since $Z = (Z_2,\dots,Z_d)$ is a function of $Y$, we readily obtain the following corollary to Theorem 9.
Corollary 10 (Change of measure for log gaps).
Let $a$ satisfying Assumption 2 be given and suppose that $\bar a_1 = -M$ for some $M \in \mathbb{N}$. For any function $g$ such that $g(Z)$ is GRD($a$)-integrable we have
[TABLE]
where the mixture weights are defined in (11). In particular, the log gaps under GRD($a$) are a mixture of independent exponential random vectors.
- Proof.
The formula (12) is a direct consequence of Theorem 9, while the claim regarding the mixture of independent exponential distributions follows from Proposition 4 and the fact that for every . ∎
As an application of Corollary 10 we obtain the moment generating function and moments of the log gaps.
Corollary 11 (Log gap moments).
Let satisfying Assumption 2 be given and suppose that for some . Set , which is explicitly given by (5) since . Then
- (i)
the moment generating function of the log gaps is given by
[TABLE]
- (ii)
for any we have that
[TABLE]
- Proof.
This follows directly from Corollary 10 and known formulas for exponential random variables. ∎
5.4 Generation of random variates
We finish Section 5 by discussing a way to simulate a random vector following a distribution when . This can be done by first simulating the log gap random vector under using the relationship in Corollary 10 and then inverting the maps . To carry this out we define a random variable on via . The simulation steps are then as follows
This ensures that . We note that the presentation of the algorithm above is simply pseudocode and the implementation can be made more efficient by vectorizing the operations.
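One way to realize such a procedure is sketched below in Python with NumPy. It is a reading of the mixture representation, not the paper's pseudocode. Expanding $Y_1^{-M} = (\sum_k Y_k/Y_1)^M$ with the multinomial formula suggests that, for each multi-index $m \in \mathbb{N}_0^d$ with $m_1 + \dots + m_d = M$, the log gaps are independent exponentials with rates $\bar a_k + \bar m_k$ (tail sums of $a$ and $m$, $k = 2,\dots,d$), and that this component is selected with probability proportional to $\binom{M}{m} \prod_{k=2}^d (\bar a_k + \bar m_k)^{-1}$. The function names are hypothetical, and enumerating all multi-indices is only practical for small $d$ and $M$.

```python
import itertools
import math
import numpy as np

def mixture_components(a):
    """Exponential rates and mixture weights when the parameters sum to -M."""
    a = np.asarray(a, dtype=float)
    d = a.size
    abar = np.cumsum(a[::-1])[::-1]              # tail sums abar_1, ..., abar_d
    M = int(round(-abar[0]))
    if M < 1 or not np.isclose(abar[0], -M) or np.any(abar[1:] <= 0):
        raise ValueError("need sum(a) = -M with M >= 1 and positive tail sums")
    rates, weights = [], []
    # Each multiset of M indices corresponds to one multi-index m with |m| = M.
    for combo in itertools.combinations_with_replacement(range(d), M):
        m = np.bincount(np.array(combo, dtype=int), minlength=d)
        mbar_tail = np.cumsum(m[::-1])[::-1][1:]  # tail sums mbar_2, ..., mbar_d
        rate = abar[1:] + mbar_tail
        coef = math.factorial(M) / math.prod(math.factorial(int(c)) for c in m)
        rates.append(rate)
        weights.append(coef / np.prod(rate))
    weights = np.array(weights)
    return np.array(rates), weights / weights.sum()

def sample_grd_exact(a, size, rng):
    """Draw ranked weight vectors: pick a component, then exponential log gaps."""
    rates, weights = mixture_components(a)
    idx = rng.choice(len(weights), size=size, p=weights)
    z = rng.exponential(scale=1.0 / rates[idx])  # shape (size, d - 1)
    log_ratio = np.concatenate(
        [np.zeros((size, 1)), -np.cumsum(z, axis=1)], axis=1
    )
    ratios = np.exp(log_ratio)                   # Y_k / Y_1
    return ratios / ratios.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
print(sample_grd_exact([-2.0, 1.0], size=3, rng=rng))
```

The number of mixture components is $\binom{M+d-1}{d-1}$, so for larger problems one would sample the multi-index sequentially rather than enumerate it.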
6 The General Case
In the case that the change of measure formula can still be used to study the GRD distributions. Indeed, by applying Newton’s generalized binomial theorem we can obtain a series representation for arbitrary in the case .
Proposition 12 (Expected powers of $Y_1$).
Let satisfying Assumption 2 be given and suppose that . Then for any we have
[TABLE]
- Proof.
We write . Note that since we have that . Hence, applying Newton’s binomial theorem and taking expectation yields
[TABLE]
Now applying the standard binomial theorem to the term inside the expectation and using the identity derived in Theorem 5(ii) completes the proof. ∎
We now combine this with the change of measure formula to obtain the following theorem.
Theorem 13 (Change of measure series representation).
Let satisfying Assumption 2 be given and suppose that for some . Then for any -integrable function we have that
[TABLE]
where
[TABLE]
and is given explicitly by (13).
- Proof.
From the change of measure identity (8) we have that
[TABLE]
The denominator has the series representation given by Proposition 12. To handle the numerator we use Newton’s binomial theorem to expand out as before, multiply both sides by and take expectation to obtain
[TABLE]
where we used the standard binomial theorem in the final equality. Proceeding as in (10) we obtain
[TABLE]
Plugging this into (15) completes the proof. ∎
The upshot of this theorem is that we can represent an arbitrary GRD($a$) distribution as a countable mixture of GRD distributions where the parameter vectors sum to zero. Applying this to the log gap process as in Section 5.3 shows, in turn, that the log gaps under an arbitrary GRD($a$) distribution are a countable mixture of independent exponential random variables. This leads to series representation formulas for the moment generating function and moments of the log gaps.
Corollary 14 (Log gap moments series representation).
Let satisfying Assumption 2 be given. Then
- (i)
the moment generating function of the log gaps is given by
[TABLE]
- (ii)
for any we have that
[TABLE]
where is defined in the statement of Theorem 13.
Moreover, the representation of as a countable mixture of independent exponential random variables suggests an approximate algorithm for generating random GRD() variates for arbitrary parameter by truncating the series appearing in (14). If we keep the first terms in the series then by rearranging the terms in the sum we obtain from (14) that
[TABLE]
where
[TABLE]
Consequently, if we define the random variable on the discrete set via
[TABLE]
then we obtain an algorithm to approximately sample from the GRD($a$) distribution for arbitrary parameter $a$.
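For comparison, draws for arbitrary real $\bar a_1 = a_1 + \dots + a_d$ can also be obtained by plain rejection sampling, which is a different technique from the series truncation described above. The sketch relies on two assumptions: that the unnormalized GRD($a$) density is $y_1^{\bar a_1}$ times the zero-sum density with the same tail sums $\bar a_2,\dots,\bar a_d$, and that the zero-sum case has independent exponential log gaps as in Section 3. Since $y_1 \in [1/d, 1]$, the factor $y_1^{\bar a_1}$ is bounded and can be scaled into an acceptance probability. All names are illustrative.

```python
import numpy as np

def sample_grd_zero_sum(abar_tail, size, rng):
    """Zero-sum sampler: independent log gaps with rates abar_2, ..., abar_d."""
    abar_tail = np.asarray(abar_tail, dtype=float)
    z = rng.exponential(scale=1.0 / abar_tail, size=(size, abar_tail.size))
    log_ratio = np.concatenate(
        [np.zeros((size, 1)), -np.cumsum(z, axis=1)], axis=1
    )
    ratios = np.exp(log_ratio)
    return ratios / ratios.sum(axis=1, keepdims=True)

def sample_grd_rejection(a, size, rng, batch=4096):
    """Rejection sampler with zero-sum proposals and target factor y_1 ** abar_1."""
    a = np.asarray(a, dtype=float)
    d = a.size
    abar = np.cumsum(a[::-1])[::-1]              # tail sums abar_1, ..., abar_d
    accepted, count = [], 0
    while count < size:
        y = sample_grd_zero_sum(abar[1:], batch, rng)
        if abar[0] >= 0:
            accept_prob = y[:, 0] ** abar[0]         # y_1 <= 1, so this is <= 1
        else:
            accept_prob = (d * y[:, 0]) ** abar[0]   # d * y_1 >= 1, so this is <= 1
        keep = y[rng.random(batch) < accept_prob]
        accepted.append(keep)
        count += len(keep)
    return np.concatenate(accepted)[:size]

rng = np.random.default_rng(0)
print(sample_grd_rejection([0.3, -0.5, 1.0], size=3, rng=rng))
```

The acceptance probability is at least $d^{-|\bar a_1|}$, so this approach is attractive for moderate $d$ and $|\bar a_1|$ and degrades otherwise, which is where a truncated mixture becomes more appealing.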
7 Conclusion
We introduced the family GRD($a$) of distributions on the ordered simplex $\nabla^{d-1}$. We established change of measure formulas that relate GRD($a$) distributions with different parameters to each other. In the case that $\bar a_1 = -M$ for some $M \in \mathbb{N}$ we exploited the change of measure identity to show that such a distribution is a (finite) mixture of GRD distributions with parameters that sum to zero. This, together with the fact that the log gaps are independent exponential random variables when the parameters sum to zero, was used to establish moment formulas, up to order $M$, for the weights $Y_k$ as well as for moments of all orders for the log gaps. This led to an algorithm which allows one to exactly sample the weights $Y$. In the case $M = 1$, the first moment formula is invertible, allowing for explicit moment matching which can be used for calibration to data. In the general case when $\bar a_1 \in \mathbb{R}$, we were able to recover many of the same properties, but under series representations rather than finite sums. This led us to an algorithm for approximately sampling the weights $Y$ in this case.
Acknowledgements.
I am grateful to Martin Larsson for helpful discussions.
References
- [1] Adrian D. Banner, Robert Fernholz, and Ioannis Karatzas. Atlas models of equity markets. Ann. Appl. Probab., 15(4):2296–2330, 2005.
- [2] Shui Feng. The Poisson-Dirichlet distribution and related topics. Probability and its Applications (New York). Springer, Heidelberg, 2010. Models and asymptotic behaviors.
- [3] E. Robert Fernholz. Stochastic portfolio theory, volume 48 of Applications of Mathematics (New York). Springer-Verlag, New York, 2002. Stochastic Modelling and Applied Probability.
- [4] Tomoyuki Ichiba, Vassilios Papathanakos, Adrian Banner, Ioannis Karatzas, and Robert Fernholz. Hybrid Atlas models. Ann. Appl. Probab., 21(2):609–644, 2011.
- [5] David Itkin. Growth Optimization in Stochastic Portfolio Theory with Applications to Robust Finance and Open Markets. PhD thesis, Carnegie Mellon University, 2022.
- [6] David Itkin and Martin Larsson. Open markets and hybrid Jacobi processes. arXiv preprint arXiv:2110.14046, 2021.
- [7] John F. C. Kingman. Random discrete distributions. Journal of the Royal Statistical Society: Series B (Methodological), 37(1):1–15, 1975.
- [8] Soumik Pal and Jim Pitman. One-dimensional Brownian particle systems with rank-dependent drifts. Ann. Appl. Probab., 18(6):2179–2207, 2008.
