Stein's method and the distribution of the product of zero mean correlated normal random variables

Robert E. Gaunt

arXiv:1906.04785·math.ST·April 13, 2021


TL;DR

This paper uses a Stein characterisation to give a short, self-contained derivation of the probability density function of the product of two zero-mean correlated normal random variables and, more generally, of the mean of n independent copies of that product.

Contribution

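It applies a recent technique from the Stein's method literature: obtain a Stein characterisation of the distribution, integrate by parts to get an ordinary differential equation for the density, and solve that equation in terms of the modified Bessel function of the second kind. The proof avoids both variance-gamma distribution theory and Fourier inversion.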

Findings

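01 New, self-contained proof of the density formula (1.1)

02 General technique (Stein characterisation, then an ODE for the density) that may be useful in related problems

03 Transparent explanation of why the modified Bessel function $K_{\nu}$ appears in the density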

Abstract

Over the last 80 years there has been much interest in the problem of finding an explicit formula for the probability density function of the product of two zero mean correlated normal random variables. Motivated by this historical interest, we use a recent technique from the Stein's method literature to obtain a simple new proof, which also serves as an exposition of a general method that may be useful in related problems.

Equations

$$p(x)=\frac{n^{(n+1)/2}2^{(1-n)/2}|x|^{(n-1)/2}}{(\sigma_{X}\sigma_{Y})^{(n+1)/2}\sqrt{\pi(1-\rho^{2})}\,\Gamma\big(\frac{n}{2}\big)}\exp\bigg(\frac{\rho nx}{\sigma_{X}\sigma_{Y}(1-\rho^{2})}\bigg)K_{\frac{n-1}{2}}\bigg(\frac{n|x|}{\sigma_{X}\sigma_{Y}(1-\rho^{2})}\bigg), \tag{1.1}$$

$$\mathbb{E}[\sigma^{2}g'(X)-(X-\mu)g(X)]=0 \tag{1.2}$$

$$\mathbb{E}\Big[\frac{1-\rho^{2}}{n^{2}}\,\overline{Z}f''(\overline{Z})+\frac{1}{n}\big((1-\rho^{2})+2\rho\overline{Z}\big)f'(\overline{Z})+(\rho-\overline{Z})f(\overline{Z})\Big]=0. \tag{2.3}$$

$$\mathbb{E}[Zf(Z)]=\mathbb{E}\big[\mathbb{E}[(1-\rho^{2})X^{2}f'(Z)+\rho X^{2}f(Z)\mid X]\big]=\mathbb{E}\big[(1-\rho^{2})X^{2}f'(Z)+\rho X^{2}f(Z)\big]. \tag{2.4}$$

$$\begin{aligned}
\mathbb{E}[Zf(Z)]&=\mathbb{E}\big[(1-\rho^{2})X^{2}f'(Z)+\rho X^{2}f(Z)\big]\\
&=\mathbb{E}\big[\mathbb{E}[(1-\rho^{2})Zf''(Z)+(1-\rho^{2})f'(Z)+\rho Zf'(Z)+\rho f(Z)\\
&\qquad\quad+(1-\rho^{2})\rho X^{2}f''(Z)+\rho^{2}X^{2}f'(Z)\mid V]\big]\\
&=\mathbb{E}\big[(1-\rho^{2})Zf''(Z)+(1-\rho^{2})f'(Z)+\rho Zf'(Z)+\rho f(Z)\\
&\qquad\quad+(1-\rho^{2})\rho X^{2}f''(Z)+\rho^{2}X^{2}f'(Z)\big]\\
&=\mathbb{E}\big[(1-\rho^{2})Zf''(Z)+(1-\rho^{2}+2\rho Z)f'(Z)+\rho f(Z)\big],
\end{aligned}$$

$$\begin{aligned}
\mathbb{E}[(W-n\rho)f(W)]&=\sum_{i=1}^{n}\mathbb{E}\big[\mathbb{E}[(1-\rho^{2})Z_{i}f''(W)+(1-\rho^{2}+2\rho Z_{i})f'(W)\mid\{Z_{j}\}_{j\neq i}]\big]\\
&=\mathbb{E}\big[(1-\rho^{2})Wf''(W)+(n(1-\rho^{2})+2\rho W)f'(W)\big],
\end{aligned}\tag{2.5}$$

$$\frac{1-\rho^{2}}{n^{2}}\,xp''(x)+\Big(\frac{(2-n)(1-\rho^{2})}{n^{2}}-\frac{2\rho x}{n}\Big)p'(x)+\Big(\frac{(n-2)\rho}{n}-x\Big)p(x)=0. \tag{2.6}$$

$$\int_{-\infty}^{\infty}\Big\{\frac{1-\rho^{2}}{n^{2}}\,xp''(x)+\Big(\frac{(2-n)(1-\rho^{2})}{n^{2}}-\frac{2\rho x}{n}\Big)p'(x)+\Big(\frac{(n-2)\rho}{n}-x\Big)p(x)\Big\}f(x)\,\mathrm{d}x=0. \tag{2.7}$$

$$p(x)=A|x|^{(n-1)/2}\exp\Big(\frac{\rho nx}{1-\rho^{2}}\Big)K_{\frac{n-1}{2}}\Big(\frac{n|x|}{1-\rho^{2}}\Big)+B|x|^{(n-1)/2}\exp\Big(\frac{\rho nx}{1-\rho^{2}}\Big)I_{\frac{n-1}{2}}\Big(\frac{n|x|}{1-\rho^{2}}\Big),$$

$$x^{2}h''(x)+xh'(x)-(x^{2}+\nu^{2})h(x)=0$$

$$\int_{-\infty}^{\infty}\mathrm{e}^{\beta x}|x|^{\nu}K_{\nu}(|x|)\,\mathrm{d}x=\frac{\sqrt{\pi}\,\Gamma(\nu+1/2)\,2^{\nu}}{(1-\beta^{2})^{\nu+1/2}},\quad\nu>-\tfrac{1}{2},\;-1<\beta<1,$$


Full text


Robert E. Gaunt (School of Mathematics, The University of Manchester, Manchester M13 9PL, UK)


Keywords: Product of correlated normal random variables; probability density function; Stein’s method

AMS 2010 Subject Classification: Primary 60E05; 62E15

1 Introduction

Let $(X,Y)$ be a bivariate normal random vector with zero mean vector, variances $(\sigma_{X}^{2},\sigma_{Y}^{2})$ and correlation coefficient $\rho$. The exact distribution of the product $Z=XY$ has been studied since 1936 (Craig [6]), with contributions including the works of Aroian [2], Aroian, Taneja and Cornwell [3], Bandi and Connaughton [5], Haldane [16] and Meeker et al. [21]; see Nadarajah and Pogány [22] for an overview of these and further contributions. The distribution of $Z$ has been used in numerous applications since 1936, with some recent examples being: product confidence limits for indirect effects (MacKinnon et al. [20]); statistics of Lagrangian power in two-dimensional turbulence (Bandi and Connaughton [5]); statistical mediation analysis (MacKinnon [19]); and electrical engineering (Ware and Lad [27]). Ware and Lad [27] have also provided three different methods of numerical integration for computing the probability that $Z$, and more generally sums of independent variates with the same distribution as $Z$, take a negative value. However, despite this interest, the problem of finding an exact formula for the probability density function (PDF) of $Z$ remained open for many years.

Recently in 2016, some 80 years after the problem was first studied, Nadarajah and Pogány [22] used an approach based on characteristic functions to obtain an explicit formula for the PDF of $Z$ . As a by-product, an explicit formula was obtained for the PDF of the mean $\overline{Z}=\frac{1}{n}(Z_{1}+\cdots+Z_{n})$ , where $Z_{1},\ldots,Z_{n}$ are independent and identical copies of $Z$ : for $n\geq 1$ variates,

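$$p(x)=\frac{n^{(n+1)/2}2^{(1-n)/2}|x|^{(n-1)/2}}{(\sigma_{X}\sigma_{Y})^{(n+1)/2}\sqrt{\pi(1-\rho^{2})}\,\Gamma\big(\frac{n}{2}\big)}\exp\bigg(\frac{\rho nx}{\sigma_{X}\sigma_{Y}(1-\rho^{2})}\bigg)K_{\frac{n-1}{2}}\bigg(\frac{n|x|}{\sigma_{X}\sigma_{Y}(1-\rho^{2})}\bigg), \tag{1.1}$$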

$x\in\mathbb{R}$, where $K_{\nu}(x)=\int_{0}^{\infty}\mathrm{e}^{-x\cosh(t)}\cosh(\nu t)\,\mathrm{d}t$ is a modified Bessel function of the second kind. Of course, on setting $n=1$ we recover the formula for the PDF of the product $Z$.
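As a rough numerical illustration of (1.1), the following sketch evaluates the density using only the Python standard library. It approximates $K_{\nu}$ from the integral representation quoted above with a trapezoidal rule. The function names `bessel_k` and `pdf_mean_product`, the truncation point and the number of quadrature steps are illustrative choices made here, not part of the paper.

```python
import math


def bessel_k(nu, x, steps=2000):
    """Approximate K_nu(x) for x > 0 from
    K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt.

    Trapezoidal rule on [0, t_max]; t_max is chosen (an assumption of this
    sketch) so that exp(-x cosh t) has dropped by a factor of exp(-40).
    """
    if x <= 0:
        raise ValueError("x must be positive")
    t_max = math.acosh(1.0 + 40.0 / x)
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # integrand at t = 0 with half weight
    for k in range(1, steps + 1):
        t = k * h
        weight = 0.5 if k == steps else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return h * total


def pdf_mean_product(x, n, rho, sigma_x=1.0, sigma_y=1.0):
    """Formula (1.1): density of the mean of n i.i.d. copies of Z = XY, x != 0."""
    if x == 0:
        raise ValueError("this sketch evaluates (1.1) only for x != 0")
    s = sigma_x * sigma_y
    a = 1.0 - rho ** 2
    nu = (n - 1) / 2.0
    const = n ** ((n + 1) / 2.0) * 2.0 ** ((1 - n) / 2.0) / (
        s ** ((n + 1) / 2.0) * math.sqrt(math.pi * a) * math.gamma(n / 2.0)
    )
    return (const * abs(x) ** nu * math.exp(rho * n * x / (s * a))
            * bessel_k(nu, n * abs(x) / (s * a)))


# Sanity check: for n = 2 and rho = 0 we have nu = 1/2, and
# K_{1/2}(x) = sqrt(pi / (2 x)) exp(-x), so (1.1) reduces to exp(-2 |x|).
print(pdf_mean_product(0.7, n=2, rho=0.0), math.exp(-1.4))
```

The case $n=2$, $\rho=0$ is a convenient test because the Bessel function then has a closed form and the two printed numbers should agree to many digits.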

Since the work of Nadarajah and Pogány [22], $Z$ and $\overline{Z}$ have been identified as variance-gamma random variables by Gaunt [11], from which formulas for their PDFs are immediate. Also, an exact formula for the PDF of a product of correlated normal random variables with non-zero means was recently obtained by Cui et al. [7]. This formula takes a complicated form, involving a double sum of modified Bessel functions of the second kind.

In this note, we use a recent technique from the Stein’s method literature to obtain a new derivation of the PDF of $\overline{Z}$ . This was in part motivated by the historical interest in this problem, but it also serves as a useful exposition of a neat technique for finding PDFs. The proof of Gaunt [11] is very simple but relies on identifying $\overline{Z}$ as a variance-gamma random variable and then exploiting results from the distributional theory of such random variables. The advantage of the approach given in this paper over that of Gaunt [11] is that we do not need to appeal to such a theory, and are able to give a simple self-contained proof. Also, our proof gives a transparent explanation as to why the modified Bessel function $K_{\nu}(x)$ occurs in the density in that our approach involves finding a second order differential equation satisfied by the PDF that closely resembles the modified Bessel differential equation. This is not so clear from the proof of Gaunt [11], nor that of Nadarajah and Pogány [22] which involves an application of the Fourier inversion formula and the evaluation of certain complex integrals in terms of $K_{\nu}(x)$ .

Introduced in 1972, Stein’s method (Stein [26]) is a powerful technique for deriving distributional approximations in probability theory. At the heart of the method is a Stein characterisation of the target distribution. For the normal distribution: $X\sim N(\mu,\sigma^{2})$ if and only if

$$\mathbb{E}[\sigma^{2}g'(X)-(X-\mu)g(X)]=0 \tag{1.2}$$

for all differentiable $g:\mathbb{R}\rightarrow\mathbb{R}$ such that $\mathbb{E}|g^{\prime}(Y)|$, $\mathbb{E}|Yg(Y)|$ and $\mathbb{E}|g(Y)|$ are all finite for $Y\sim N(\mu,\sigma^{2})$. We note that necessity is obtained almost immediately from the differential equation $\sigma^{2}\phi^{\prime}(x)+(x-\mu)\phi(x)=0$ that the $N(\mu,\sigma^{2})$ PDF $\phi(x)$ solves: just multiply through by $g(x)$, integrate over $\mathbb{R}$ and then integrate by parts. Over the years, Stein characterisations have been obtained for many classical probability distributions (for an overview see Gaunt, Mijoule and Swan [12] and Ley, Reinert and Swan [17]), and also recently for more exotic distributions, such as linear combinations of gamma random variables (Arras et al. [4]) and products of independent normal, beta and gamma random variables (Gaunt [9, 10]), for which it is difficult to write down a formula for the PDF of the distribution.
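The necessity argument just described can be written out as a short worked computation. It is a sketch under the extra assumption, made here for simplicity, that $g(x)\phi(x)\rightarrow 0$ as $x\rightarrow\pm\infty$ (this holds, for instance, for bounded $g$):

$$\begin{aligned}
0&=\int_{-\infty}^{\infty}g(x)\big\{\sigma^{2}\phi'(x)+(x-\mu)\phi(x)\big\}\,\mathrm{d}x\\
&=\sigma^{2}\big[g(x)\phi(x)\big]_{-\infty}^{\infty}-\sigma^{2}\int_{-\infty}^{\infty}g'(x)\phi(x)\,\mathrm{d}x+\int_{-\infty}^{\infty}(x-\mu)g(x)\phi(x)\,\mathrm{d}x\\
&=-\mathbb{E}\big[\sigma^{2}g'(X)-(X-\mu)g(X)\big].
\end{aligned}$$

The same pattern, run in the opposite direction, is what later turns a Stein characterisation into a differential equation for a density.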

Stein characterisations of probability distributions are most commonly used as part of Stein’s method to derive distributional approximations, with powerful applications in random graph and network theory (Franceschetti and Meester [8]), convergence rates in classical asymptotic results in statistics (Anastasiou and Reinert [1], Gaunt, Pickett and Reinert [14]), Bayesian statistics (Ley, Reinert and Swan [18]) and statistical learning and inference (Gorham et al. [15]); see the survey Ross [24] for a list of further application areas. However, recently Gaunt [10] and Gaunt, Mijoule and Swan [12] have found a novel application for Stein characterisations, in which they are used to establish formulas for PDFs of distributions that are too difficult to obtain via other methods. The basic approach, which we shall employ in this note, is to obtain a Stein characterisation of $\overline{Z}$ and then apply integration by parts to the characterising equation to deduce an ordinary differential equation (ODE) that the PDF must satisfy, from which we easily obtain the formula (1.1) for the density.

2 Proof of (1.1) via a Stein characterisation of the distribution

Here, we provide an alternative proof of the main results of Nadarajah and Pogány [22]. Throughout, we shall set $\sigma_{X}^{2}=\sigma_{Y}^{2}=1$ ; the extension to the general case is straightforward. The starting point is the following Stein characterisation of $\overline{Z}$ . Often in the Stein’s method literature a full characterisation of distributions is given, as given for the normal distribution in (1.2). However, for our purposes we only require necessity.

Proposition 2.1.

Suppose $f:\mathbb{R}\rightarrow\mathbb{R}$ is twice differentiable with $\mathbb{E}|f(\overline{Z})|$ , $\mathbb{E}|\overline{Z}f(\overline{Z})|$ , $\mathbb{E}|f^{\prime}(\overline{Z})|$ , $\mathbb{E}|\overline{Z}f^{\prime}(\overline{Z})|$ and $\mathbb{E}|\overline{Z}f^{\prime\prime}(\overline{Z})|$ all finite. Then

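$$\mathbb{E}\Big[\frac{1-\rho^{2}}{n^{2}}\,\overline{Z}f''(\overline{Z})+\frac{1}{n}\big((1-\rho^{2})+2\rho\overline{Z}\big)f'(\overline{Z})+(\rho-\overline{Z})f(\overline{Z})\Big]=0. \tag{2.3}$$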

Proof.

We first establish the result for $n=1$ before extending to $n\geq 1$ . Define the random variable $V$ by $V=(Y-\rho X)/\sqrt{1-\rho^{2}}$ , which is readily seen to be standard normally distributed and independent of $X$ . Then, we can write $Z=\sqrt{1-\rho^{2}}VX+\rho X^{2}$ . Therefore, $Z\,|\,X\sim N(\rho X^{2},(1-\rho^{2})X^{2})$ , and we obtain from (1.2) that

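$$\mathbb{E}[Zf(Z)]=\mathbb{E}\big[\mathbb{E}[(1-\rho^{2})X^{2}f'(Z)+\rho X^{2}f(Z)\mid X]\big]=\mathbb{E}\big[(1-\rho^{2})X^{2}f'(Z)+\rho X^{2}f(Z)\big]. \tag{2.4}$$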
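The two properties of $V$ used in this step follow from a short covariance computation, recorded here as a worked check under the standing assumption $\sigma_{X}^{2}=\sigma_{Y}^{2}=1$. Since $(X,V)$ is a linear transformation of the bivariate normal vector $(X,Y)$, it is jointly normal, so zero covariance implies independence:

$$\begin{aligned}
\mathbb{E}[V]&=0,\\
\mathrm{Var}(V)&=\frac{\mathrm{Var}(Y)-2\rho\,\mathrm{Cov}(X,Y)+\rho^{2}\,\mathrm{Var}(X)}{1-\rho^{2}}=\frac{1-2\rho^{2}+\rho^{2}}{1-\rho^{2}}=1,\\
\mathrm{Cov}(X,V)&=\frac{\mathrm{Cov}(X,Y)-\rho\,\mathrm{Var}(X)}{\sqrt{1-\rho^{2}}}=\frac{\rho-\rho}{\sqrt{1-\rho^{2}}}=0.
\end{aligned}$$

Consequently $Y=\sqrt{1-\rho^{2}}\,V+\rho X$, and conditional on $X$ the product $Z=XY$ is a linear function of $V$ with mean $\rho X^{2}$ and variance $(1-\rho^{2})X^{2}$.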

Let $z=\sqrt{1-\rho^{2}}vx+\rho x^{2}$ , and note that $\frac{\partial}{\partial x}\big{(}xf(z)\big{)}=(z+\rho x^{2})f^{\prime}(z)+f(z)$ . Then, on using the Stein characterisation of the normal distribution (1.2) with $\mu=0$ , $\sigma^{2}=1$ and $g(x)=(1-\rho^{2})xf^{\prime}(z)+\rho xf(z)$ to obtain the second equality, we have that

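$$\begin{aligned}
\mathbb{E}[Zf(Z)]&=\mathbb{E}\big[(1-\rho^{2})X^{2}f'(Z)+\rho X^{2}f(Z)\big]\\
&=\mathbb{E}\big[\mathbb{E}[(1-\rho^{2})Zf''(Z)+(1-\rho^{2})f'(Z)+\rho Zf'(Z)+\rho f(Z)\\
&\qquad\quad+(1-\rho^{2})\rho X^{2}f''(Z)+\rho^{2}X^{2}f'(Z)\mid V]\big]\\
&=\mathbb{E}\big[(1-\rho^{2})Zf''(Z)+(1-\rho^{2})f'(Z)+\rho Zf'(Z)+\rho f(Z)\\
&\qquad\quad+(1-\rho^{2})\rho X^{2}f''(Z)+\rho^{2}X^{2}f'(Z)\big]\\
&=\mathbb{E}\big[(1-\rho^{2})Zf''(Z)+(1-\rho^{2}+2\rho Z)f'(Z)+\rho f(Z)\big],
\end{aligned}$$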

where the final equality follows from (2.4) with $f$ replaced by $f^{\prime}$.
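For completeness, the derivative of $g$ that enters through the normal Stein characterisation can be computed explicitly. With $z=\sqrt{1-\rho^{2}}\,vx+\rho x^{2}$ we have $x\,\partial z/\partial x=z+\rho x^{2}$, so that

$$\begin{aligned}
g'(x)&=(1-\rho^{2})\frac{\partial}{\partial x}\big(xf'(z)\big)+\rho\frac{\partial}{\partial x}\big(xf(z)\big)\\
&=(1-\rho^{2})\big\{(z+\rho x^{2})f''(z)+f'(z)\big\}+\rho\big\{(z+\rho x^{2})f'(z)+f(z)\big\}\\
&=(1-\rho^{2})zf''(z)+(1-\rho^{2})f'(z)+\rho zf'(z)+\rho f(z)+(1-\rho^{2})\rho x^{2}f''(z)+\rho^{2}x^{2}f'(z).
\end{aligned}$$

The last two terms are then handled by (2.4) applied to $f'$: $\rho\,\mathbb{E}[(1-\rho^{2})X^{2}f''(Z)+\rho X^{2}f'(Z)]=\rho\,\mathbb{E}[Zf'(Z)]$, which is why the coefficient of $Zf'(Z)$ doubles from $\rho$ to $2\rho$.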

Now, we extend to $n\geq 1$. Let $W=n\overline{Z}=\sum_{i=1}^{n}Z_{i}$. Then, by conditioning,

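$$\begin{aligned}
\mathbb{E}[(W-n\rho)f(W)]&=\sum_{i=1}^{n}\mathbb{E}\big[\mathbb{E}[(1-\rho^{2})Z_{i}f''(W)+(1-\rho^{2}+2\rho Z_{i})f'(W)\mid\{Z_{j}\}_{j\neq i}]\big]\\
&=\mathbb{E}\big[(1-\rho^{2})Wf''(W)+(n(1-\rho^{2})+2\rho W)f'(W)\big],
\end{aligned}\tag{2.5}$$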

and on substituting $f(x)=g(x/n)$ into (2.5) and dividing through by $n$ we obtain (2.3), with $g$ in place of $f$. ∎
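The change of variables in this last step can be made explicit. Writing $f(x)=g(x/n)$ gives $f'(x)=g'(x/n)/n$ and $f''(x)=g''(x/n)/n^{2}$, and since $W=n\overline{Z}$ the identity for $W$ becomes

$$n\,\mathbb{E}\big[(\overline{Z}-\rho)g(\overline{Z})\big]=\mathbb{E}\Big[\frac{1-\rho^{2}}{n}\,\overline{Z}g''(\overline{Z})+\big((1-\rho^{2})+2\rho\overline{Z}\big)g'(\overline{Z})\Big].$$

Dividing by $n$ and rearranging gives a characterising equation for $\overline{Z}$ with coefficient $(1-\rho^{2})/n^{2}$ on the second-derivative term and $1/n$ on the first-derivative term. A quick sanity check is the choice $g(x)=x$, for which the equation reads $\frac{1}{n}(1+\rho^{2})+\rho^{2}-\mathbb{E}[\overline{Z}^{2}]=0$; this agrees with $\mathbb{E}[\overline{Z}]=\rho$ and $\mathrm{Var}(\overline{Z})=(1+\rho^{2})/n$.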

Corollary 2.2.

The PDF $p(x)$ of $\overline{Z}$ satisfies the ODE

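$$\frac{1-\rho^{2}}{n^{2}}\,xp''(x)+\Big(\frac{(2-n)(1-\rho^{2})}{n^{2}}-\frac{2\rho x}{n}\Big)p'(x)+\Big(\frac{(n-2)\rho}{n}-x\Big)p(x)=0. \tag{2.6}$$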

Proof.

Let $\mathcal{F}$ denote the class of functions $f$ satisfying the conditions of Proposition 2.1. Then, applying integration by parts to (2.3) gives that

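$$\int_{-\infty}^{\infty}\Big\{\frac{1-\rho^{2}}{n^{2}}\,xp''(x)+\Big(\frac{(2-n)(1-\rho^{2})}{n^{2}}-\frac{2\rho x}{n}\Big)p'(x)+\Big(\frac{(n-2)\rho}{n}-x\Big)p(x)\Big\}f(x)\,\mathrm{d}x=0. \tag{2.7}$$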

Since (2.7) holds for all $f\in\mathcal{F}$, we deduce that $p$ satisfies the ODE (2.6). ∎
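The integration by parts can be sketched term by term, assuming (for the purposes of this sketch) that all boundary terms vanish. Each derivative is moved from $f$ onto the product of $p$ with its polynomial coefficient, so the coefficients are differentiated along with $p$. For constants $c$, $\alpha$ and $\beta$,

$$\begin{aligned}
\int_{-\infty}^{\infty}c\,xf''(x)p(x)\,\mathrm{d}x&=\int_{-\infty}^{\infty}c\,\big(xp(x)\big)''f(x)\,\mathrm{d}x=\int_{-\infty}^{\infty}c\,\big\{xp''(x)+2p'(x)\big\}f(x)\,\mathrm{d}x,\\
\int_{-\infty}^{\infty}(\alpha+\beta x)f'(x)p(x)\,\mathrm{d}x&=-\int_{-\infty}^{\infty}\big\{(\alpha+\beta x)p'(x)+\beta p(x)\big\}f(x)\,\mathrm{d}x.
\end{aligned}$$

Taking $c=(1-\rho^{2})/n^{2}$, $\alpha=(1-\rho^{2})/n$ and $\beta=2\rho/n$, and adding the zeroth-order term $(\rho-x)p(x)$, the coefficient of $p'(x)$ is $2c-\alpha-\beta x$ and the coefficient of $p(x)$ is $\rho-\beta-x$.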

Proof of (1.1). The general solution to (2.6) is given by

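$$p(x)=A|x|^{(n-1)/2}\exp\Big(\frac{\rho nx}{1-\rho^{2}}\Big)K_{\frac{n-1}{2}}\Big(\frac{n|x|}{1-\rho^{2}}\Big)+B|x|^{(n-1)/2}\exp\Big(\frac{\rho nx}{1-\rho^{2}}\Big)I_{\frac{n-1}{2}}\Big(\frac{n|x|}{1-\rho^{2}}\Big),$$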

where $A$ and $B$ are arbitrary constants and $I_{\nu}(x)=\sum_{k=0}^{\infty}\frac{(x/2)^{2k+\nu}}{k!\Gamma(k+\nu+1)}$ is a modified Bessel function of the first kind. That this is the general solution can be deduced from the fact that the general solution to the modified Bessel differential equation

$$x^{2}h''(x)+xh'(x)-(x^{2}+\nu^{2})h(x)=0$$

is given by $h(x)=CK_{\nu}(x)+DI_{\nu}(x)$ (see Olver et al. [23]). Now, for $p$ to be a PDF we require that $\int_{-\infty}^{\infty}p(x)\,\mathrm{d}x=1$. But, for any $\nu\in\mathbb{R}$, as $x\rightarrow\infty$, $I_{\nu}(x)\sim\frac{1}{\sqrt{2\pi x}}\mathrm{e}^{x}$, so the term involving $I_{\nu}$ is not integrable and we must take $B=0$. We can also find $A$ by using the integral formula (which can be obtained by using the series expansion $\mathrm{e}^{\beta x}=\sum_{k=0}^{\infty}\frac{1}{k!}(\beta x)^{k}$ followed by formula (10.43.19) of Olver et al. [23])

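$$\int_{-\infty}^{\infty}\mathrm{e}^{\beta x}|x|^{\nu}K_{\nu}(|x|)\,\mathrm{d}x=\frac{\sqrt{\pi}\,\Gamma(\nu+1/2)\,2^{\nu}}{(1-\beta^{2})^{\nu+1/2}},\quad\nu>-\tfrac{1}{2},\;-1<\beta<1,$$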

from which we deduce that the PDF of $\overline{Z}$ is given by (1.1). $\Box$
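As a numerical sanity check of this last step, offered as a sketch rather than as part of the proof, the code below takes the $K_{\nu}$ branch of the solution (with $A=1$ and $\sigma_{X}=\sigma_{Y}=1$) and tests by central finite differences that it satisfies the second-order ODE for the density. The ODE is used in the form obtained by integrating the Stein identity for $\overline{Z}$ by parts and multiplying through by $n^{2}$, namely $(1-\rho^{2})xp''+((2-n)(1-\rho^{2})-2n\rho x)p'+n((n-2)\rho-nx)p=0$. The helper names, the quadrature used for $K_{\nu}$ and the step size are assumptions of this sketch.

```python
import math


def bessel_k(nu, x, steps=2000):
    """K_nu(x) for x > 0 via trapezoidal quadrature of
    int_0^inf exp(-x cosh t) cosh(nu t) dt (truncation point is a sketch choice)."""
    t_max = math.acosh(1.0 + 40.0 / x)
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # integrand at t = 0 with half weight
    for k in range(1, steps + 1):
        t = k * h
        weight = 0.5 if k == steps else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return h * total


def candidate(x, n, rho):
    """Unnormalised K-branch: |x|^nu exp(rho n x / a) K_nu(n |x| / a), a = 1 - rho^2."""
    a = 1.0 - rho ** 2
    nu = (n - 1) / 2.0
    return abs(x) ** nu * math.exp(rho * n * x / a) * bessel_k(nu, n * abs(x) / a)


def ode_residual(x, n, rho, h=1e-3):
    """Left-hand side of
    a x p'' + ((2 - n) a - 2 n rho x) p' + n ((n - 2) rho - n x) p
    at x, with derivatives replaced by central differences of step h."""
    a = 1.0 - rho ** 2
    p0 = candidate(x, n, rho)
    p_plus = candidate(x + h, n, rho)
    p_minus = candidate(x - h, n, rho)
    d1 = (p_plus - p_minus) / (2.0 * h)
    d2 = (p_plus - 2.0 * p0 + p_minus) / h ** 2
    return (a * x * d2
            + ((2 - n) * a - 2 * n * rho * x) * d1
            + n * ((n - 2) * rho - n * x) * p0)


# The residuals should be small (limited by the finite-difference error).
for x in (0.7, -0.5):
    print(x, ode_residual(x, n=3, rho=0.4))
```

Points on both sides of the origin are checked because the solution involves $|x|$ and is only smooth away from $x=0$.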

Remark 2.3.

An interesting way in which this work could be extended would be to repeat the analysis in the more general setting of the product of two correlated normal variates with non-zero mean vector; the exact distribution of this random variable is not known in the current literature. This is in principle possible, but one faces additional technical difficulties. A Stein characterisation for this distribution in the special case of zero correlation (for which the PDF has already been given by Cui et al. [7]) is given in Proposition 3.3 of Gaunt, Mijoule and Swan [13]. Applying the same argument that has been used in this paper yields a fourth order ODE that the density must satisfy; see Section 3.3.2 of Gaunt, Mijoule and Swan [13]. As noted in that work, it is a difficult task to solve the ODE, and this would be even more challenging for the ODE corresponding to the more general correlated case.

Acknowledgements

The author is supported by a Dame Kathleen Ollerenshaw Research Fellowship. The author would like to thank the referees for their helpful comments.

Bibliography

[1] Anastasiou, A. and Reinert, G. Bounds for the normal approximation of the maximum likelihood estimator. Bernoulli 23 (2017), pp. 191–218.
[2] Aroian, L. A. The probability function of the product of two normally distributed variables. Ann. Math. Stat. 18 (1947), pp. 265–271.
[3] Aroian, L. A., Taneja, V. S. and Cornwell, L. W. Mathematical forms of the distribution of the product of two normal variables. Commun. Stat. Theory 7 (1978), pp. 165–172.
[4] Arras, B., Azmoodeh, E., Poly, G. and Swan, Y. Stein characterizations for linear combinations of gamma random variables. To appear in Braz. J. Probab. Stat., 2019+.
[5] Bandi, M. M. and Connaughton, C. Craig’s $XY$ distribution and the statistics of Lagrangian power in two-dimensional turbulence. Phys. Rev. E 77 (2008), 036318.
[6] Craig, C. C. On the frequency function of $xy$. Ann. Math. Stat. 7 (1936), pp. 1–15.
[7] Cui, G., Yu, X., Iommelli, S. and Kong, L. Exact distribution for the product of two correlated Gaussian random variables. IEEE Signal Process. Lett. 23 (2016), pp. 1662–1666.
[8] Franceschetti, M. and Meester, R. Critical node lifetimes in random networks via the Chen-Stein method. IEEE T. Inform. Theory 52 (2006), pp. 2831–2837.
