Some new Stein operators for product distributions
Robert E. Gaunt, Guillaume Mijoule, Yvik Swan

TL;DR
This paper develops a general method for deriving Stein operators for the product of two independent random variables, extending previous work and applying to various distributions including normals and gamma, with insights into their complexity.
Contribution
It introduces a unified framework for Stein operators of product distributions, covering non-centered normals, gamma, and variance-gamma distributions, expanding prior results.
Findings
Derived Stein operators for products of independent variables.
Provided a simple derivation of the characteristic function for product of normals.
Explained the increased complexity of the PDF in non-centered cases.
Abstract
We provide a general result for finding Stein operators for the product of two independent random variables whose Stein operators satisfy a certain assumption, extending a recent result of Gaunt, Mijoule and Swan [13]. This framework applies to non-centered normal and non-centered gamma random variables, as well as a general sub-family of the variance-gamma distributions. Curiously, there is an increase in complexity in the Stein operators for products of independent normals as one moves, for example, from centered to non-centered normals. As applications, we give a simple derivation of the characteristic function of the product of independent normals, and provide insight into why the probability density function of this distribution is much more complicated in the non-centered case than the centered case.
MSC classes: 60E15, 62E15
Keywords: Stein’s method, Stein operators, product distributions, product of independent normal random variables
Affiliations: The University of Manchester (Robert E. Gaunt); INRIA Paris (Guillaume Mijoule); Université libre de Bruxelles (Yvik Swan)
1 Introduction
In 1972, Charles Stein [25] introduced a powerful technique for deriving explicit bounds in normal approximations. Shortly after, in 1975, Louis Chen [5] adapted the method to the Poisson distribution, and since then Stein’s method has been extended to a wide variety of distributional approximations. For a given target distribution $p$, the first step in the general procedure is to find a suitable operator $A$ acting on a class of functions $\mathcal{F}$ such that $\mathbb{E}[Af(X)]=0$ for all $f\in\mathcal{F}$, where the random variable $X$ has distribution $p$. The operator $A$ is called the Stein operator, and for continuous distributions is typically a differential operator; for the $N(0,1)$ distribution, the classical operator is $Af(x)=f'(x)-xf(x)$. This leads to the Stein equation
$$Af(x)=h(x)-\mathbb{E}[h(X)], \tag{1.1}$$
where $h$ is a real-valued function. If $A$ is well chosen then, for a given $h$, the Stein equation (1.1) can be solved for $f$, and the problem of estimating the proximity of the distribution of a random variable of interest $W$ to the distribution of the target random variable $X$, as measured by $|\mathbb{E}[h(W)]-\mathbb{E}[h(X)]|$, reduces to one of bounding $|\mathbb{E}[Af(W)]|$. For a detailed account of the method we refer the reader to the monograph Stein [26].
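As a small sanity check of the defining property of a Stein operator, the sketch below verifies the classical standard normal identity $\mathbb{E}[f'(X)-Xf(X)]=0$ on the monomials $f(x)=x^k$, using the exact moments $\mathbb{E}[X^{2j}]=(2j-1)!!$. The function names are illustrative and do not come from the paper.

```python
def std_normal_moment(k: int) -> int:
    """E[X^k] for X ~ N(0,1): zero for odd k, (k-1)!! for even k."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(1, k, 2):
        result *= j
    return result


def stein_residual(k: int) -> int:
    """E[f'(X) - X f(X)] for f(x) = x^k, i.e. k E[X^(k-1)] - E[X^(k+1)]."""
    derivative_term = k * std_normal_moment(k - 1) if k > 0 else 0
    return derivative_term - std_normal_moment(k + 1)


# Every residual should vanish if f'(x) - x f(x) is a Stein operator for N(0,1).
residuals = [stein_residual(k) for k in range(10)]
print(residuals)
```

Monomials are only a convenient test class here; the identity itself holds for a much wider class of functions.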
In addition to the normal and Poisson distributions, Stein’s method has been adapted to many classical distributions, such as the exponential (Chatterjee, Fulman and Röllin [4]), gamma (Luk [18]) and Laplace (Pike and Ren [22]), as well as quite general families of distributions, such as the Pearson family (Schoutens [24]), variance-gamma distributions (Gaunt [8]) and a wide class of distributions satisfying a certain diffusive assumption (Döbler [7], Kusuoka and Tudor [15]); for an overview see Ley, Reinert and Swan [16]. As such, over the years, a number of techniques have been developed for finding Stein operators for a variety of distributions. These include the density method (Stein [26], Ley, Reinert and Swan [16], Ley and Swan [17], Mijoule, Reinert and Swan [19]), the generator method (Barbour [3], Götze [14]), the differential equation duality approach (Gaunt [10], Ley, Reinert and Swan [16]), and probability generating function and characteristic function based approaches of Upadhye, Čekanavičius and Vellaisamy [27] and Arras et al. [2]. The corpus of literature concerning Stein operators and their applications is now vast, and it continues growing at a steady pace. Stein operators provide handles on target distributions which are in some sense just as important and natural characteristics of a probability distribution as its moments, its moment generating function, its p.d.f., c.d.f. or even its characteristic function. Finding tractable Stein operators is thus, naturally, an important question.
In this paper we pursue the work begun in Gaunt [9] and Gaunt [11] concerning the following question: “given two independent random variables $X$ and $Y$ with Stein operators $A_X$ and $A_Y$, can one find a Stein operator for $XY$?” More specifically, the present paper is a complement (sequel) to our paper Gaunt, Mijoule and Swan [13] where we developed an algebraic technique for finding Stein operators for products of independent random variables with polynomial Stein operators satisfying a technical condition. Let $Mf(x)=xf(x)$, $Df(x)=f'(x)$ and $I$ be the identity operator. We say that the absolutely continuous variates $X$ and $Y$ have polynomial Stein operators if they allow Stein operators of the form $A=\sum_{i,j}a_{ij}M^iD^j$ for some real numbers $a_{ij}$. The highest value of $j$ such that $a_{ij}\neq0$ is called the order of the operator. In Gaunt, Mijoule and Swan [13] we provided a method for deriving operators under the technical assumption that $\#\{j-i \mid a_{ij}\neq0\}\leq2$ (see Assumption 3 and Lemma 2.6 of Gaunt, Mijoule and Swan [13] for more details on this condition). For such random variables, Proposition 2.12 of Gaunt, Mijoule and Swan [13] gives a polynomial Stein operator for the product $XY$. A number of classical random variables have Stein operators which satisfy this assumption, such as the $N(0,1)$ distribution with Stein operator $D-M$, with others including the gamma, beta, and even some more exotic distributions such as the zero-mean symmetric variance-gamma distribution and PRR distribution of Peköz, Röllin and Ross [21]. However, some very natural densities do not satisfy the assumption. In fact, even the non-centered normal distribution does not satisfy this assumption, as its Stein operator $D-M+\mu I$ (for mean $\mu$ and unit variance) instead satisfies $\#\{j-i \mid a_{ij}\neq0\}\leq3$. In Proposition 2.1, we shall address the natural problem of extending the result of Gaunt, Mijoule and Swan [13] to treat the product of two independent random variables satisfying this new assumption.
Here we have only added one level of complexity in the operator; nevertheless, as we will see later on, it is sufficient to include the classical cases of non-centered normal and non-centered gamma, and a more general sub-family of the variance-gamma distributions. Also, as noted in Remark 2.3, the proof technique is novel and seems to be a useful addition to the toolkit for finding Stein operators.
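To make the technical condition concrete, the following sketch stores a polynomial operator $\sum_{i,j}a_{ij}M^iD^j$ as a dictionary keyed by $(i,j)$ and counts the distinct values of $j-i$ among its nonzero coefficients. It assumes that the condition of [13] is a bound of two on this count and that the non-centered normal operator is $D-M+\mu I$; the helper name and the value of the mean are hypothetical.

```python
def index_gaps(coeffs):
    """Distinct values of j - i over the nonzero coefficients a_ij of M^i D^j."""
    return {j - i for (i, j), a in coeffs.items() if a != 0}


mu = 1.5  # hypothetical non-zero mean, chosen only for illustration

# Keys are (i, j): the power of M (multiplication by x) and of D (differentiation).
centered_normal = {(0, 1): 1, (1, 0): -1}                 # D - M
noncentered_normal = {(0, 1): 1, (1, 0): -1, (0, 0): mu}  # D - M + mu I

gaps_centered = index_gaps(centered_normal)
gaps_noncentered = index_gaps(noncentered_normal)
print(len(gaps_centered), len(gaps_noncentered))
```

Under these assumptions the centered operator meets the earlier condition, while the extra $\mu I$ term adds a third value of $j-i$, which is the extra level of complexity treated in this paper.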
The Stein operators for the products of independent normal random variables are particularly theoretically interesting, and we devote Section 3 to exploring some of their properties. For the case of two independent centered normals a second order Stein operator was obtained by Gaunt [9], whereas, rather curiously, we find a third order operator for the product of two i.i.d. normals, and a fourth order operator for the product of two independent general normals; see Table 1. It is an important and natural question to ask whether our operators have minimal order amongst all Stein operators with polynomial coefficients. We believe this is the case but are unable to prove it. However, in Section 3.1, we are able to provide a brute force approach for verifying this assertion for polynomial coefficients up to a particular order. This brute force approach is very general and in principle can be applied to any polynomial Stein operators. In Section 3.2, we prove that our Stein operators for products of independent normals characterise the distribution. We do this by appealing to a more general result, Proposition 3.2, which treats distributions that are determined by their moments.
For the Stein operator of Gaunt [9] for the product of two independent standard normal random variables, it was possible to solve the corresponding Stein equation and bound the derivatives of the solution. As a result, Gaunt [9] was able to derive explicit bounds for product normal approximations. However, it seems to be beyond the scope of existing techniques in the Stein’s method literature to solve and then bound the derivatives of the solution to our more complicated third and fourth order Stein equations for products of non-centered normals. It should be noted, though, that there is still great utility to Stein equations even when it is not possible to obtain bounds for the solution. For example, as has been demonstrated in several papers such as Nourdin, Peccati and Swan [20], Arras et al. [1] and Arras et al. [2], Stein operators can be used for comparison of probability distributions directly without solving Stein equations. We also stress that Stein operators are also of use in applications beyond proving approximation theorems; for example, in obtaining distributional properties (Gaunt [9], Gaunt [11], Gaunt, Mijoule and Swan [13]). Indeed, in Section 3.3, we use our Stein operators to obtain a simple derivation of the characteristic function of two independent normals, and also provide valuable insight into why there is a dramatic increase in complexity in the probability density function from the centered to non-centered case.
2 New Stein operators for product distributions
2.1 A general result
Throughout this paper, we shall make the following assumptions, which were also made in Gaunt, Mijoule and Swan [13]; we refer the reader to that paper for some remarks on these assumptions.
**Assumption ** 1). admits a smooth density with respect to the Lebesgue measure on ; this density is defined and non-vanishing on some (possibly unbounded) interval . 2). admits an operator acting on which contains the set of smooth functions with compact support .
Let be a real polynomial. Then it is easily proved (by checking it when is a monomial, then by linearity) that
[TABLE]
(recall the notations , and the identity operator from the introduction). Now, for , let . Simple computations show that (see Gaunt, Mijoule and Swan [13, Lemma 2.5]) and This implies that for any real polynomial ,
[TABLE]
**Proposition 2.1.**
Let and be i.i.d. with common Stein operator of the form
[TABLE]
for two real polynomials. Then, a Stein operator for is
[TABLE]
where, for ,
[TABLE]
Proof.
Let and . Denote . We have
[TABLE]
Similarly,
[TABLE]
Replace with in (2.4) and add up to (2.3) to get
[TABLE]
which is also, using (2.1),
[TABLE]
Now using (2.4) and conditioning, we can compute
[TABLE]
We also have
[TABLE]
Thus we obtain by (2.6)
[TABLE]
Apply (2.7) to and add up to (2.5) applied to to obtain
[TABLE]
Apply the preceding equation to and subtract to (2.7) applied to to get the result. ∎
The case that and are polynomials of degree one is important, as it is applicable to non-centered normal and non-centered gamma random variables, as well as a general sub-family of the variance-gamma distributions. To this end, let us define the operator . We note that the limit of as is ill-defined, but we do have (see Gaunt, Mijoule and Swan [13, Remark 2.3]).
**Corollary 2.2.**
Let and (if either or are set to , then we proceed as described above). Let be i.i.d. with common Stein operator
[TABLE]
Then, a Stein operator for is
[TABLE]
Proof.
Set and in (2.2). A calculation then verifies that (2.8) and (2.2) are equivalent operators in this case (up to a factor ). ∎
**Remark 2.3.**
The proof of Proposition 2.1 involves applying certain equations to test functions of the form , where is a linear differential operator. This allowed us to cancel terms to obtain (2.2). We consider this technique to be a useful addition to the toolkit for finding Stein operators. Indeed, this approach was recently used by Gaunt [12] to find Stein operators for the and , where is the -th Hermite polynomial and . In Section 2.2.4, we also use the technique to derive a Stein operator for the product of independent non-centered normals with different means.
**Remark 2.4.**
We attempted to generalise Proposition 2.1 so that and are no longer identically distributed, for which and have Stein operators of the form and . We were only able to find a Stein operator for the product under the very restrictive condition that . This Stein operator had the unusual feature of not being symmetric in and . In certain simple cases, we can, however, apply the proof technique of Proposition 2.1 to derive a Stein operator for the product of two non-identically distributed random variables; see Section 2.2.4.
**Remark 2.5.**
Note that, whilst the Stein operator for and in Proposition 2.1 satisfies the condition , the Stein operator (2.2) for their product satisfies . Thus, it is not possible to iterate Proposition 2.1 to find a Stein operator for the product of three i.i.d. random variables. This is in contrast to the work of Gaunt, Mijoule and Swan [13] which was carried out under the assumption .
2.2 Examples
2.2.1 Product of non-centered normals
Assume $X$ and $Y$ are independent $N(\mu,1)$ random variables. A Stein operator for $X$ (or $Y$) is $A_Xf(x)=f'(x)-(x-\mu)f(x)$. Applying Corollary 2.2 to this operator gives the following Stein operator for $Z=XY$:
$$A_Zf(x)=xf'''(x)+(1-x)f''(x)-(x+1+\mu^2)f'(x)+(x-\mu^2)f(x). \tag{2.9}$$
(Here, and for the rest of this paper, we consider the unit variance case; the extension to the general case follows from a straightforward rescaling and the resulting Stein operator for the product is given in Table 1.) Note that when $\mu=0$ the operator becomes
$$A_Zf(x)=xf'''(x)+(1-x)f''(x)-(x+1)f'(x)+xf(x).$$
Taking $g(x)=f'(x)-f(x)$ then yields $xg''(x)+g'(x)-xg(x)$, which we recognise as the product normal Stein operator that was obtained by Gaunt [9].
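The third order operator can be checked on polynomial test functions with exact arithmetic. The sketch below assumes the operator takes the form $xf'''+(1-x)f''-(x+1+\mu^2)f'+(x-\mu^2)f$ for $Z=XY$ with $X,Y$ i.i.d. $N(\mu,1)$, and evaluates $\mathbb{E}[A_Zf(Z)]$ for $f(x)=x^n$ using $\mathbb{E}[Z^k]=(\mathbb{E}[X^k])^2$. The helper names and the test means are illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def normal_moments(mean, n_max):
    """Raw moments E[X^k], k = 0..n_max, of X ~ N(mean, 1),
    from the recursion m_k = mean * m_{k-1} + (k - 1) * m_{k-2}."""
    m = [Fraction(1), Fraction(mean)]
    for k in range(2, n_max + 1):
        m.append(mean * m[k - 1] + (k - 1) * m[k - 2])
    return m


def third_order_residual(n, mean):
    """E[A f(Z)] for f(x) = x^n under the assumed third order operator."""
    m = normal_moments(mean, n + 1)

    def M(k):
        # E[Z^k] = (E[X^k])^2 by independence; negative powers never contribute.
        return m[k] ** 2 if k >= 0 else Fraction(0)

    d1, d2, d3 = n, n * (n - 1), n * (n - 1) * (n - 2)
    return (d3 * M(n - 2)
            + d2 * (M(n - 2) - M(n - 1))
            - d1 * (M(n) + (1 + mean ** 2) * M(n - 1))
            + M(n + 1) - mean ** 2 * M(n))


means = [Fraction(0), Fraction(1), Fraction(-3, 2), Fraction(5, 7)]
residuals = [third_order_residual(n, mu) for mu in means for n in range(9)]
print(all(r == 0 for r in residuals))
```

Rational means keep the check exact, so a nonzero residual would indicate a genuine mismatch rather than rounding error.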
2.2.2 Product of non-centered gammas
Assume and are distributed as a , with p.d.f. , , and let . A Stein operator for (or ) is Corollary 2.2 applied with , , , yields the following fourth-order Stein operator for :
[TABLE]
Note also that when , this operator reduces to , which is the product gamma Stein operator of Gaunt [11] applied to instead of .
2.2.3 Product of variance-gamma random variables
The variance-gamma distribution with parameters , , , has p.d.f.
[TABLE]
, where , , is the modified Bessel function of the second kind. If a random variable has density (2.10) then we write . A Stein operator is given by (see Gaunt [8]). Applying Corollary 2.2 with , we get the following Stein operator for the product of two independent random variables:
[TABLE]
Note that when we have
[TABLE]
Defining by gives
[TABLE]
which is in agreement with the product variance-gamma Stein operator given in Section 3.2 of Gaunt, Mijoule and Swan [13]. Lastly, we note that the Stein operator of Gaunt [8], as given by
[TABLE]
satisfies when , and therefore one cannot apply Proposition 2.1 or Corollary 2.2 to find a Stein operator for the product of two such variates.
2.2.4 Product of non-identically distributed non-central normals
By working on a case-by-case basis it is possible to use the proof technique of Proposition 2.1 to find Stein operators for the product of two non-identically distributed random variables, whose Stein operators satisfy the assumptions of the proposition. We find that a Stein operator for the product $Z$ of independent normals $N(\mu_X,1)$ and $N(\mu_Y,1)$ is
$$A_Zf(x)=xf^{(4)}(x)+f^{(3)}(x)-(2x+\mu_X\mu_Y)f''(x)-(1+\mu_X^2+\mu_Y^2)f'(x)+(x-\mu_X\mu_Y)f(x). \tag{2.11}$$
Let us now provide a derivation of this Stein operator. Let and be independent standard normal random variables and define . We will use repeatedly the fact that for , as well as conditioning arguments, and we let stand for the expectation conditioned on . Let be four times differentiable and such that for and for , where . Then
[TABLE]
By again applying a conditioning argument we obtain
[TABLE]
(and the same applies to ). Hence
[TABLE]
Isolating the expressions depending on from (2.12) and (2.13), we obtain two different equations:
[TABLE]
and
[TABLE]
Subtract (2.15) from (2.14) to get
[TABLE]
from which we deduce that (2.11) is a Stein operator for .
Lastly, we note that applying the operator (2.9) to $f'+f$ yields
$$xf^{(4)}(x)+f^{(3)}(x)-(2x+\mu^2)f''(x)-(1+2\mu^2)f'(x)+(x-\mu^2)f(x),$$
which we recognise as the Stein operator (2.11) in the special case $\mu_X=\mu_Y=\mu$.
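The fourth order operator for unequal means can be checked in the same exact-arithmetic style. The sketch assumes the operator is $xf^{(4)}+f^{(3)}-(2x+\mu_X\mu_Y)f''-(1+\mu_X^2+\mu_Y^2)f'+(x-\mu_X\mu_Y)f$ for the product of independent $N(\mu_X,1)$ and $N(\mu_Y,1)$ variables, and tests it on monomials. All names and the chosen means are hypothetical.

```python
from fractions import Fraction


def normal_moments(mean, n_max):
    """Raw moments E[X^k], k = 0..n_max, of X ~ N(mean, 1)."""
    m = [Fraction(1), Fraction(mean)]
    for k in range(2, n_max + 1):
        m.append(mean * m[k - 1] + (k - 1) * m[k - 2])
    return m


def fourth_order_residual(n, mean_x, mean_y):
    """E[A f(Z)] for f(x) = x^n, with Z the product of N(mean_x,1) and N(mean_y,1)."""
    mx = normal_moments(mean_x, n + 1)
    my = normal_moments(mean_y, n + 1)
    p = mean_x * mean_y
    s = mean_x ** 2 + mean_y ** 2

    def M(k):
        # E[Z^k] = E[X^k] E[Y^k] by independence.
        return mx[k] * my[k] if k >= 0 else Fraction(0)

    d1 = n
    d2 = d1 * (n - 1)
    d3 = d2 * (n - 2)
    d4 = d3 * (n - 3)
    return ((d4 + d3) * M(n - 3)
            - d2 * (2 * M(n - 1) + p * M(n - 2))
            - d1 * (1 + s) * M(n - 1)
            + M(n + 1) - p * M(n))


pairs = [(Fraction(0), Fraction(0)), (Fraction(1), Fraction(2)),
         (Fraction(-1, 2), Fraction(3)), (Fraction(2, 3), Fraction(2, 3))]
residuals = [fourth_order_residual(n, a, b) for a, b in pairs for n in range(9)]
print(all(r == 0 for r in residuals))
```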
2.2.5 Sums of products of normals
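Let us begin by noting a simple result, that has perhaps surprisingly not previously been stated explicitly in the literature. Suppose $X_1,\ldots,X_n$ are i.i.d., with Stein operator $A_Xf(x)=\sum_{k=0}^m(a_kx+b_k)f^{(k)}(x)$, where $m\geq1$ and the $a_k$ and $b_k$ are real-valued constants. Let $W=\sum_{i=1}^nX_i$. Then, by conditioning,
$$\mathbb{E}\bigg[\sum_{k=0}^m(a_kW+nb_k)f^{(k)}(W)\bigg]=\sum_{i=1}^n\mathbb{E}\bigg[\sum_{k=0}^m(a_kX_i+b_k)f^{(k)}(W)\bigg]=0.$$
Thus, a Stein operator for $W$ is given by
$$A_Wf(x)=\sum_{k=0}^m(a_kx+nb_k)f^{(k)}(x). \tag{2.16}$$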
**Remark 2.6.**
Identity (2.16) actually generalises similar observations for score functions and Stein kernels, for which such an additive stability is well-known, see Nourdin, Peccati and Swan [20].
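Since the coefficients in the Stein operators (2.9) and (2.11) are linear, we can use (2.16) to write down a Stein operator for the sum $W=\sum_{i=1}^nX_iY_i$, where $X_i\sim N(\mu_X,1)$ and $Y_i\sim N(\mu_Y,1)$ are independent. When $\mu_X=\mu_Y=\mu$, we have
$$A_Wf(x)=xf'''(x)+(n-x)f''(x)-(x+n(1+\mu^2))f'(x)+(x-n\mu^2)f(x), \tag{2.17}$$
and when $\mu_X$ and $\mu_Y$ are not necessarily equal, we have
$$A_Wf(x)=xf^{(4)}(x)+nf^{(3)}(x)-(2x+n\mu_X\mu_Y)f''(x)-n(1+\mu_X^2+\mu_Y^2)f'(x)+(x-n\mu_X\mu_Y)f(x). \tag{2.18}$$
When $\mu_X=\mu_Y=0$, the random variable $W$ follows the $\mathrm{VG}(n,0,1,0)$ distribution (see Gaunt [8], Proposition 1.3). Taking $g(x)=f'(x)-f(x)$ in (2.17) gives $xg''(x)+ng'(x)-xg(x)$, which we recognise as the Stein operator that was obtained in Gaunt [8].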
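The additive rule can also be tested exactly. The sketch below assumes that passing from one product to a sum of $n$ independent products multiplies each constant coefficient by $n$ and leaves the coefficients of $x$ unchanged, giving $xf^{(4)}+nf^{(3)}-(2x+n\mu_X\mu_Y)f''-n(1+\mu_X^2+\mu_Y^2)f'+(x-n\mu_X\mu_Y)f$. Moments of the sum are built by binomial convolution; all names are illustrative.

```python
from fractions import Fraction
from math import comb


def normal_moments(mean, n_max):
    """Raw moments E[X^k], k = 0..n_max, of X ~ N(mean, 1)."""
    m = [Fraction(1), Fraction(mean)]
    for k in range(2, n_max + 1):
        m.append(mean * m[k - 1] + (k - 1) * m[k - 2])
    return m


def sum_moments(single, n_terms):
    """Moments of a sum of n_terms i.i.d. copies, given the single-copy moments."""
    k_max = len(single) - 1
    total = [Fraction(1)] + [Fraction(0)] * k_max  # moments of the empty sum
    for _ in range(n_terms):
        total = [sum(comb(k, j) * total[j] * single[k - j] for j in range(k + 1))
                 for k in range(k_max + 1)]
    return total


def sum_residual(k, n_terms, mean_x, mean_y):
    """E[A_W f(W)] for f(x) = x^k, W a sum of n_terms independent products."""
    mx, my = normal_moments(mean_x, k + 1), normal_moments(mean_y, k + 1)
    W = sum_moments([a * b for a, b in zip(mx, my)], n_terms)
    p, s = mean_x * mean_y, mean_x ** 2 + mean_y ** 2

    def M(j):
        return W[j] if j >= 0 else Fraction(0)

    d1 = k
    d2 = d1 * (k - 1)
    d3 = d2 * (k - 2)
    d4 = d3 * (k - 3)
    return ((d4 + n_terms * d3) * M(k - 3)
            - d2 * (2 * M(k - 1) + n_terms * p * M(k - 2))
            - d1 * n_terms * (1 + s) * M(k - 1)
            + M(k + 1) - n_terms * p * M(k))


residuals = [sum_residual(k, n, Fraction(1), Fraction(-2, 3))
             for n in (1, 2, 5) for k in range(8)]
print(all(r == 0 for r in residuals))
```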
3 Some results concerning the Stein operators for products of independent normal random variables
3.1 On the minimality of the operators
The operator (2.8) is at most a seventh order differential operator. However, for particular cases, such as the product of two i.i.d. non-centered normals, the operator reduces to one of lower order, see Section 2.2.1. Whilst we believe that the third order operator (2.9) is a minimal order polynomial operator, we have no proof of this claim (nor do we have much intuition as to whether the seventh order operator (2.8) is of minimal order). We believe this question of minimality to be important:
**Conjecture 3.1.**
There exists no second order Stein operator (acting on smooth functions with compact support) with polynomial coefficients for the product of two independent non-centered normal random variables.
One can use a brute force approach to verify the conjecture for polynomials of fixed order (if the conjecture holds). Such results would be worthwhile in practice, because a third order Stein operator with linear coefficients may be easier to use in applications than one of second order with polynomial coefficients of degree greater than one.
Let us now use the brute force approach to prove that there is no second order Stein operator with linear coefficients for the product of two independent non-centered normals (generalisations are obvious). Let and be independent random variables and let . Suppose that there was such a Stein operator for , then it would be of the form , where . Now, if was a Stein operator for , we would have for all in some class that contains the monomials . Taking , , we obtain six equations for six unknowns. Letting denote , we have , , , , and . This leads to the system of equations
[TABLE]
We used Mathematica to compute that the determinant of the matrix corresponding to this system of equations is . Therefore, there is a unique solution, which is clearly . Thus, there does not exist a second order Stein operator with linear coefficients for .
Similarly, one can show that there is no third order Stein operator with linear coefficients for the product of two independent normals with different means. Here we took and , and sought a Stein operator of the form . We then used the monomials , , to generate eight linear equations in eight unknowns, and found the determinant of the matrix corresponding to this system of equations to be .
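The brute force argument for second order operators can be sketched as follows. The code assumes a candidate operator $(a_0+b_0x)f+(a_1+b_1x)f'+(a_2+b_2x)f''$, imposes $\mathbb{E}[Af(Z)]=0$ on the monomials $x^k$, $k=0,\ldots,5$, and computes the exact determinant of the resulting $6\times6$ system for $Z=XY$ with $X,Y$ i.i.d. $N(\mu,1)$. The mean $\mu=1$ is an illustrative choice, and the determinant routine is a plain Gaussian elimination written for this example.

```python
from fractions import Fraction


def product_moments(mean, n_max):
    """E[Z^k] = (E[X^k])^2 for Z = XY with X, Y i.i.d. N(mean, 1)."""
    m = [Fraction(1), Fraction(mean)]
    for k in range(2, n_max + 1):
        m.append(mean * m[k - 1] + (k - 1) * m[k - 2])
    return [v * v for v in m[: n_max + 1]]


def system_matrix(mean):
    """Row k holds the coefficients of (a0, b0, a1, b1, a2, b2) in E[A x^k] = 0."""
    M = product_moments(mean, 6)

    def mom(j):
        return M[j] if j >= 0 else Fraction(0)

    return [[mom(k), mom(k + 1),
             k * mom(k - 1), k * mom(k),
             k * (k - 1) * mom(k - 2), k * (k - 1) * mom(k - 1)]
            for k in range(6)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n, det = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


det_centered = determinant(system_matrix(0))
det_noncentered = determinant(system_matrix(1))
print(det_centered, det_noncentered != 0)
```

In the centered case the determinant vanishes, consistent with the known second order operator $xf''+f'-xf$; for mean one it does not, so only the zero operator satisfies all six equations.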
3.2 Characterisation by the operators
We begin with a simple general result, which perhaps surprisingly has not previously been given in the literature. The proof technique has, however, appeared in the literature; see the proof of Lemma 5.2 of Ross [23] for the case of the exponential distribution.
**Proposition 3.2.**
Suppose that the law of the random variable , supported on , is determined by its moments. Let the operator , where , act on a class of functions which contains all polynomial functions. Suppose is a Stein operator for : that is, for all ,
[TABLE]
Now, let , where the maxima and minima are taken over all such that . Suppose that the first moments of are equal to those of and that
[TABLE]
for all . Then has the same law as .
Proof.
We prove that all moments of are equal to those of . As the moments of determine its law, verifying this proves the Proposition. The monomials are contained in the class , so applying , , to (3.2) yields the recurrence
[TABLE]
where if and otherwise. We have that and we are given that for . We can then use forward substitution in (3.3) to (uniquely) obtain all moments of . Due to (3.1), for all , and so it follows by the above reasoning that
[TABLE]
But this is the same recurrence relation as (3.3) and, since for , it follows that for all as well. ∎
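The forward substitution step in the proof can be illustrated on a concrete operator. Assuming the third order operator $xf'''+(1-x)f''-(x+1+\mu^2)f'+(x-\mu^2)f$ for the product $Z$ of two i.i.d. $N(\mu,1)$ variables, applying it to $x^k$ gives the recurrence $M_{k+1}=(\mu^2+k)M_k+k(k+\mu^2)M_{k-1}-k(k-1)^2M_{k-2}$ for $M_k=\mathbb{E}[Z^k]$. The sketch generates higher moments from the first few; the function name is hypothetical.

```python
from fractions import Fraction


def moments_from_recurrence(mean, initial, k_max):
    """Extend the moment list `initial` (M_0, M_1, M_2) up to M_{k_max}
    by forward substitution in the recurrence induced by the operator."""
    mu2 = Fraction(mean) ** 2
    M = [Fraction(v) for v in initial]
    while len(M) <= k_max:
        k = len(M) - 1
        prev2 = M[k - 2] if k >= 2 else Fraction(0)
        M.append((mu2 + k) * M[k]
                 + k * (k + mu2) * M[k - 1]
                 - k * (k - 1) ** 2 * prev2)
    return M


# For mean 1 the first three moments of Z are 1, 1 and 4.
moments = moments_from_recurrence(1, [1, 1, 4], 6)
print(moments)
```

For mean one the moments of $N(1,1)$ are $1,1,2,4,10,26,76$, so the recurrence should reproduce their squares.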
If we have obtained a Stein operator for a random variable , then Proposition 3.2 tells us that the operator characterises the law of if is determined by its moments. This characterisation is weaker than those typically found in Stein’s method literature, as it involves moment conditions on the random variable . This is perhaps not surprising, because the characterisations given in the literature have mostly been found on a case-by-case basis, whereas ours applies to a wide class of distributions.
The distribution of the product of two independent normal distributions is determined by its moments, which can be seen from the existence of its moment generating function for all ; see Section 3.3.1. The following full characterisation of the distribution is thus immediate from Proposition 3.2.
**Proposition 3.3.**
(i) Let be a real-valued random variable whose first three moments are equal to that of the random variable , where and are independent. Then is equal in law to if and only if
[TABLE]
for all functions such that for , and for , where .
(ii) Now suppose that , and that the first two moments of are equal to those of . Then is equal in law to if and only if
[TABLE]
for all such that , , and , .
Proposition 3.2 can be used to prove that some other Stein operators given in the literature fully characterise the distribution. For example, the Stein operator for the product of independent Beta random variables of Gaunt [11] is characterising, since this product is supported on and thus the distribution is determined by its moments.
3.3 Applications of the operators
3.3.1 Characteristic function
As the Stein operator (2.11) has linear coefficients, it turns out to be straightforward to use the characterising equation (3.3) to find a formula for the characteristic function of the random variable $Z=XY$, where $X\sim N(\mu_X,1)$ and $Y\sim N(\mu_Y,1)$ are independent.
On taking $f(x)=e^{itx}$ in the characterising equation (3.3) and setting $\varphi(t)=\mathbb{E}[e^{itZ}]$, we deduce that $\varphi$ satisfies the differential equation
$$(1+t^2)^2\varphi'(t)+\big\{t^3+(1+\mu_X^2+\mu_Y^2)t+i\mu_X\mu_Y(t^2-1)\big\}\varphi(t)=0. \tag{3.5}$$
It should be noted that $f(x)=e^{itx}$ is a complex-valued function; here we have applied the characterising equation to the real and imaginary parts of $f$, which are themselves real-valued functions. Solving (3.5) subject to the condition that $\varphi(0)=1$ then gives that
$$\varphi(t)=\frac{1}{\sqrt{1+t^2}}\exp\bigg(\frac{2i\mu_X\mu_Yt-(\mu_X^2+\mu_Y^2)t^2}{2(1+t^2)}\bigg). \tag{3.6}$$
Replacing $t$ by $-it$ yields a formula for the moment generating function $M(t)=\mathbb{E}[e^{tZ}]$, which is well-defined for $|t|<1$. We doubt these formulas are new, but it is interesting to note that we were able to obtain such a simple proof via the Stein characterisation.
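A numerical check of this derivation is sketched below. It assumes the characteristic function $\varphi(t)=(1+t^2)^{-1/2}\exp\{(2i\mu_X\mu_Yt-(\mu_X^2+\mu_Y^2)t^2)/(2(1+t^2))\}$ and the first order equation $(1+t^2)^2\varphi'(t)+\{t^3+(1+\mu_X^2+\mu_Y^2)t+i\mu_X\mu_Y(t^2-1)\}\varphi(t)=0$ obtained by inserting $f(x)=e^{itx}$ into the fourth order operator. The derivative is approximated by a central difference, so the residual is only expected to be small, not exactly zero; function names and the step size are illustrative.

```python
import cmath


def char_fn(t, mean_x, mean_y):
    """Assumed characteristic function of the product of N(mean_x,1) and N(mean_y,1)."""
    s = mean_x ** 2 + mean_y ** 2
    p = mean_x * mean_y
    exponent = (2j * p * t - s * t * t) / (2 * (1 + t * t))
    return cmath.exp(exponent) / cmath.sqrt(1 + t * t)


def ode_residual(t, mean_x, mean_y, h=1e-5):
    """Left-hand side of the assumed ODE, with phi' from a central difference."""
    s = mean_x ** 2 + mean_y ** 2
    p = mean_x * mean_y
    dphi = (char_fn(t + h, mean_x, mean_y) - char_fn(t - h, mean_x, mean_y)) / (2 * h)
    coeff = t ** 3 + (1 + s) * t + 1j * p * (t * t - 1)
    return (1 + t * t) ** 2 * dphi + coeff * char_fn(t, mean_x, mean_y)


points = (-2.0, -0.5, 0.0, 0.7, 1.5)
max_residual = max(abs(ode_residual(t, 1.0, 2.0)) for t in points)
print(max_residual < 1e-6)
```

At $t=0$ the equation reduces to $\varphi'(0)=i\mu_X\mu_Y$, which agrees with $\mathbb{E}[Z]=\mu_X\mu_Y$.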
3.3.2 Probability density function
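Let $X\sim N(\mu_X,1)$ and $Y\sim N(\mu_Y,1)$ be independent, and let $Z=XY$. For $\mu_X=\mu_Y=0$, it is a well-known and easy to prove result that the p.d.f. is given by $p(x)=\frac{1}{\pi}K_0(|x|)$, $x\in\mathbb{R}$. However, in general, the p.d.f. takes a much more complicated form (see Cui et al. [6]): for $x\in\mathbb{R}$,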
[TABLE]
It is possible to use the Stein operators for the product $Z$ to gain insight into why there is such a dramatic increase in complexity from the zero mean case to non-zero mean case. To see this, we recall a duality result given in Remark 2.7 of Gaunt, Mijoule and Swan [13] (see also Section 4 of that paper for further details). If $W$ admits a smooth density $p$, which solves the differential equation $Bp=0$ with $B=\sum_{i,j}b_{ij}M^iD^j$, then a Stein operator for $W$ is given by $A=\sum_{i,j}(-1)^jb_{ij}D^jM^i$, and similarly given a Stein operator for $W$ one can write down a differential equation satisfied by $p$. In this manner, we can write down differential equations satisfied by the density $p$ of the random variable $W=\sum_{i=1}^nX_iY_i$, where the $X_i$ and $Y_i$ are independent copies of $X$ and $Y$ respectively, using the Stein operators (2.17) and (2.18) for this distribution. When $\mu_X=\mu_Y=\mu$, we have
$$xp'''(x)+(x+3-n)p''(x)-(x+n(1+\mu^2)-2)p'(x)-(x+1-n\mu^2)p(x)=0, \tag{3.8}$$
and in general
$$xp^{(4)}(x)+(4-n)p^{(3)}(x)-(2x+n\mu_X\mu_Y)p''(x)+(n(1+\mu_X^2+\mu_Y^2)-4)p'(x)+(x-n\mu_X\mu_Y)p(x)=0. \tag{3.9}$$
In the special case $\mu_X=\mu_Y=0$, the density of $W$ satisfies the modified Bessel differential equation $xp''(x)+(2-n)p'(x)-xp(x)=0$.
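The duality between Stein operators and differential equations for the density amounts to taking a formal adjoint, which is mechanical when the coefficients are linear. The sketch below represents an operator $\sum_j(a_jx+b_j)D^j$ as a dictionary `{j: (a_j, b_j)}` and applies $(-1)^jD^j[(a_jx+b_j)q]=(-1)^j[(a_jx+b_j)q^{(j)}+ja_jq^{(j-1)}]$. The representation, the function names and the assumed form of the fourth order sum operator are illustrative choices.

```python
def adjoint(op):
    """Formal adjoint of sum_j (a_j x + b_j) D^j, given as {j: (a_j, b_j)}."""
    out = {}
    for j, (a, b) in op.items():
        sign = (-1) ** j
        slope, intercept = out.get(j, (0, 0))
        out[j] = (slope + sign * a, intercept + sign * b)
        if j >= 1 and a != 0:
            # Differentiating the factor a_j x once lowers the order by one.
            slope, intercept = out.get(j - 1, (0, 0))
            out[j - 1] = (slope, intercept + sign * j * a)
    return out


def sum_operator(n, mean_x, mean_y):
    """Assumed fourth order Stein operator for a sum of n independent products."""
    p = mean_x * mean_y
    s = mean_x ** 2 + mean_y ** 2
    return {4: (1, 0), 3: (0, n), 2: (-2, -n * p),
            1: (0, -n * (1 + s)), 0: (1, -n * p)}


# Zero-mean case: adjoint of x g'' + n g' - x g, here with n = 3.
bessel_ode = adjoint({2: (1, 0), 1: (0, 3), 0: (-1, 0)})
general_ode = adjoint(sum_operator(3, 1, 2))
print(bessel_ode)
print(general_ode)
```

For $n=3$ the first result encodes $xp''-p'-xp=0$, matching the modified Bessel form $xp''+(2-n)p'-xp=0$.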
From Section 3.1 and the duality result of Gaunt, Mijoule and Swan [13], we know there do not exist differential equations for with linear coefficients with a lower degree than (3.8) and (3.9). Moreover, we were unable to transform (3.8) or (3.9) into a well-understood class, such as the Meijer -function differential equation. Therefore, the increase in complexity in the p.d.f. of from the zero mean to non-zero mean case can be understood from the increase in complexity of the differential equation satisfied by . Also, due to the above reasoning, it seems plausible that formula (3.7) cannot be simplified further.
Finally, we note that there is not a severe increase in complexity in the differential equations satisfied by from the case to the general case. To the best of our knowledge, a formula for general has not been obtained in the literature, and even if the differential equations (3.8) and (3.9) are not ultimately used to derive such a formula, they do indicate that the formula should be at a similar level of complexity to that of (3.7), and thus provide motivation for obtaining such a formula. We note that such a result would be of interest due to the occurrence of such random variables in, for example, electrical engineering applications, see Ware and Lad [28].
Acknowledgements
RG acknowledges support from EPSRC grant EP/K032402/1 and is currently supported by a Dame Kathleen Ollerenshaw Research Fellowship. RG is grateful to Université de Liège, FNRS and EPSRC for funding a visit to Université de Liège, where some of the details of this project were worked out. YS acknowledges support by the Fonds de la Recherche Scientifique - FNRS under Grant MIS F.4539.16. Part of GM’s research was supported by a Welcome Grant from Université de Liège. We would like to thank the referee for their careful reading of our paper and their helpful comments.
- [1] Arras, B., Azmoodeh, E., Poly, G. and Swan, Y. A bound on the 2-Wasserstein distance between linear combinations of independent random variables. Stochastic Processes and their Applications 129 (2019), pp. 2341–2375.
- [2] Arras, B., Azmoodeh, E., Poly, G. and Swan, Y. Stein characterizations for linear combinations of gamma random variables. To appear in Brazilian Journal of Probability and Statistics, 2019+.
- [3] Barbour, A. D. Stein’s method for diffusion approximations. Probability Theory and Related Fields 84 (1990), pp. 297–322.
- [4] Chatterjee, S., Fulman, J. and Röllin, A. Exponential approximation by Stein’s method and spectral graph theory. ALEA Latin American Journal of Probability and Mathematical Statistics 8 (2011), pp. 197–223.
- [5] Chen, L. H. Y. Poisson approximation for dependent trials. Annals of Probability 3 (1975), pp. 534–545.
- [6] Cui, G., Yu, X., Iommelli, S. and Kong, L. Exact Distribution for the Product of Two Correlated Gaussian Random Variables. IEEE Signal Processing Letters 23 (2016), pp. 1662–1666.
- [7] Döbler, C. Stein’s method of exchangeable pairs for the beta distribution and generalizations. Electronic Journal of Probability 20 no. 109 (2015), pp. 1–34.
- [8] Gaunt, R. E. Variance-Gamma approximation via Stein’s method. Electronic Journal of Probability 19 no. 38 (2014), pp. 1–33.
