Semiparametric estimation in the normal variance-mean mixture model

Denis Belomestny; Vladimir Panov

arXiv:1705.07578·stat.OT·May 23, 2017

TL;DR

This paper introduces a semiparametric estimation method for variance-mean mixture models, focusing on estimating the normal mean and the mixing distribution density, with demonstrated effectiveness on simulated and real data.

Contribution

It presents a novel two-step semiparametric estimation procedure for variance-mean mixtures, combining parametric mean estimation with nonparametric mixing density recovery.

Findings

01

Effective estimation demonstrated on simulated data

02

Application to real data on diamond sizes in marine deposits

03

Improved understanding of mixture model parameters

Abstract

In this paper we study the problem of statistical inference on the parameters of the semiparametric variance-mean mixtures. This class of mixtures has recently become rather popular in statistical and financial modelling. We design a semiparametric estimation procedure that first estimates the mean of the underlying normal distribution and then recovers nonparametrically the density of the corresponding mixing distribution. We illustrate the performance of our procedure on simulated and real data.

Equations

$$p(x;\mu,G)=\int_{\mathbb{R}_{+}}\varphi_{\mathcal{N}(\mu s,s)}(x)\,G(ds)=\int_{\mathbb{R}_{+}}\frac{1}{\sqrt{2\pi s}}\exp\left\{-\frac{(x-\mu s)^{2}}{2s}\right\}G(ds),$$

$$X\stackrel{d}{=}\mu\xi+\sqrt{\xi}\,\eta,\qquad\eta\sim\mathcal{N}(0,1),\quad\xi\sim G.$$

$$p_{K}(x;\mu,G)=\int_{\mathbb{R}_{+}}\varphi_{\mathcal{N}(\alpha+\beta/s,\,1/s)}(x)\,G(ds)$$

$$p(x;\mu,G)=e^{x\mu}\,I_{\mu,G}\left(-x^{2}/2\right),$$

$$I_{\mu,G}(u):=\int_{\mathbb{R}_{+}}\frac{1}{\sqrt{2\pi s}}\exp\left\{\frac{u}{s}-\frac{\mu^{2}}{2}s\right\}G(ds).$$

$$p(-x;\mu,G)=e^{-x\mu}\,I_{\mu,G}\left(-x^{2}/2\right),$$

$$\mu=\frac{1}{2x}\log\left(\frac{p(x;\mu,G)}{p(-x;\mu,G)}\right).$$

$$\mu=\int_{\mathbb{R}}\frac{1}{2x}\log\left(\frac{p(x;\mu,G)}{p(-x;\mu,G)}\right)p(x;\mu,G)\,dx,$$

$$H(p):=-\int_{\mathbb{R}}\log\left(p(x;\mu,G)\right)p(x;\mu,G)\,dx.$$

$$w(x)\leq 0,\quad x\geq 0,\qquad w(-x)=-w(x),\qquad\mathrm{supp}(w)\subset[-A,A]$$

$$W(\rho):=\mathbb{E}\left[e^{-\rho X}w(X)\right],\quad\rho\in\mathbb{R},$$

$$\mu_{n}:=\inf\left\{\rho>0:W_{n}(\rho)=0\right\}\wedge M$$

$$W_{n}(\rho):=\frac{1}{n}\sum_{i=1}^{n}e^{-\rho X_{i}}w(X_{i}).$$

$$W_{n}^{\prime}(\rho)=\frac{1}{n}\sum_{i=1}^{n}(-X_{i})\,e^{-\rho X_{i}}w(X_{i})\geq 0$$

$$\Lambda(M,p):=\left\|\left(1+e^{-MX}\right)Xw(X)\right\|_{p}<\infty.$$

$$\left\|\mu_{n}-\mu\right\|_{p}\leq\frac{KM^{1-1/p}}{n^{1/2}}\left[\frac{1}{W^{\prime}(\mu)}+\frac{1}{W(M/2)}\right]$$

$$\phi_{X}(u):=\mathbb{E}\left[e^{\mathrm{i}uX}\right]=\mathbb{E}\left[e^{-\xi\psi(u)}\right]=\mathcal{L}_{\xi}(\psi(u)),$$

$$\mathcal{M}\left[\mathcal{L}_{\xi}\right](z):=\int_{\mathbb{R}_{+}}\mathcal{L}_{\xi}(u)\,u^{z-1}\,du,$$

$$\int_{\mathbb{R}_{+}}\mathcal{L}_{\xi}(u)\,u^{z-1}\,du=\int_{l}\mathcal{L}_{\xi}(w)\,w^{z-1}\,dw,$$

$$\mathcal{M}\left[\mathcal{L}_{\xi}\right](z)=\int_{\mathbb{R}_{+}}\mathcal{L}_{\xi}(\psi(u))\left[\psi(u)\right]^{z-1}\psi^{\prime}(u)\,du=\int_{\mathbb{R}_{+}}\phi_{X}(u)\left[\psi(u)\right]^{z-1}\psi^{\prime}(u)\,du$$

$$\widehat{\mathcal{M}}\left[\mathcal{L}_{\xi}\right](z):=\begin{cases}\int_{0}^{U_{n}}\phi_{n}(u)\left[\psi(u)\right]^{z-1}\psi^{\prime}(u)\,du, & \mu\,\mathrm{Im}(z)<0,\\ \int_{0}^{U_{n}}\overline{\phi_{n}(u)}\left[\overline{\psi(u)}\right]^{z-1}\overline{\psi^{\prime}(u)}\,du, & \mu\,\mathrm{Im}(z)>0,\end{cases}$$

$$\phi_{n}(u)=\frac{1}{n}\sum_{k=1}^{n}e^{\mathrm{i}uX_{k}},$$

$$\left|\left[\psi(u)\right]^{z}\right|=\exp\left\{\frac{\mathrm{Re}(z)}{2}\log\left(\mu^{2}u^{2}+u^{4}/4\right)\right\}\cdot\exp\left\{\mathrm{Im}(z)\cdot\arctan(2\mu/u)\right\}$$

$$\left|\left[\overline{\psi(u)}\right]^{z}\right|=\exp\left\{\frac{\mathrm{Re}(z)}{2}\log\left(\mu^{2}u^{2}+u^{4}/4\right)\right\}\cdot\exp\left\{\mathrm{Im}(z)\cdot\arctan(-2\mu/u)\right\}$$

$$\mathcal{M}\left[\mathcal{L}_{\xi}\right](z)$$

$$\mathcal{M}\left[g\right](z)=\frac{\mathcal{M}\left[\mathcal{L}_{\xi}\right](1-z)}{\Gamma(1-z)}=\frac{1}{\Gamma(1-z)}\int_{\mathbb{R}_{+}}\phi_{X}(u)\left[\psi(u)\right]^{-z}\psi^{\prime}(u)\,du.$$

$$\widehat{\mathcal{M}}\left[g\right](z):=\frac{\widehat{\mathcal{M}}\left[\mathcal{L}_{\xi}\right](1-z)}{\Gamma(1-z)}.$$

$$g(x)=\frac{1}{2\pi}\int_{\mathbb{R}}\mathcal{M}\left[g\right](\gamma+\mathrm{i}v)\cdot x^{-\gamma-\mathrm{i}v}\,dv$$

$$\widehat{g}_{n,\gamma}^{\circ}(x)$$

$$\mathcal{E}(\alpha,\gamma_{\circ},\gamma^{\circ},L)$$

Taxonomy

Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Financial Risk and Volatility Modeling

Full text

Denis Belomestny

University of Duisburg-Essen, Thea-Leymann-Str. 9, 45127 Essen, Germany

and

Laboratory of Stochastic Analysis and its Applications, National Research University Higher School of Economics, Shabolovka, 26, 119049 Moscow, Russia

Vladimir Panov

Laboratory of Stochastic Analysis and its Applications, National Research University Higher School of Economics, Shabolovka, 26, 119049 Moscow, Russia

Keywords: variance-mean mixture model, semiparametric inference, Mellin transform, generalized hyperbolic distribution.

This work has been funded by the Russian Academic Excellence Project “5-100”.

1 Introduction and set-up

A normal variance-mean mixture is defined as

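$$p(x;\mu,G)=\int_{\mathbb{R}_{+}}\varphi_{\mathcal{N}(\mu s,s)}(x)\,G(ds)=\int_{\mathbb{R}_{+}}\frac{1}{\sqrt{2\pi s}}\exp\left\{-\frac{(x-\mu s)^{2}}{2s}\right\}G(ds),\tag{1}$$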

where $\mu\in\mathbb{R},$ $\varphi_{\mathcal{N}(\mu s,s)}$ stands for the density of a normal distribution with mean $\mu s$ and variance $s$ , and $G$ is a mixing distribution on $\mathbb{R}_{+}.$ As can be easily seen, a random variable $X$ has the distribution (1) if and only if

$$X\stackrel{d}{=}\mu\xi+\sqrt{\xi}\,\eta,\qquad\eta\sim\mathcal{N}(0,1),\quad\xi\sim G.\tag{2}$$

Variance-mean mixture models play an important role in statistical modelling and have many applications. In particular, such mixtures appear as limit distributions in the asymptotic theory for dependent random variables and they are also useful for modelling data stemming from heavy-tailed and skewed distributions, see, e.g., Barndorff-Nielsen, Kent and Sørensen [6], Barndorff-Nielsen [4], Bingham and Kiesel [9], Bingham, Kiesel and Schmidt [10]. If $G$ is the generalized inverse Gaussian distribution, then the normal variance-mean mixture distribution coincides with the so-called generalized hyperbolic distribution. The latter distribution has the important property that the logarithm of its density function is a smooth unimodal curve approaching linear asymptotes. This type of distribution was used to model the sizes of the particles of sand (Bagnold [2], Barndorff-Nielsen and Christensen [5]), or the diamond sizes in marine deposits in South West Africa (Barndorff-Nielsen [3]).

In this paper we study the problem of statistical inference for the mixing distribution $G$ and the parameter $\mu$ based on a sample $X_{1},\ldots,X_{n}$ from the distribution with density $p(\cdot;\mu,G).$ This problem was already considered in the literature, but mainly in parametric situations. For example, in the case of the generalised hyperbolic distributions some parametric approaches can be found in Jørgensen [12], and Karlis and Lillestöl [13]. There are also a few papers dealing with the general semiparametric case. For example, Korsholm [14] considered the statistical inference for a more general model of the form

$$p_{K}(x;\mu,G)=\int_{\mathbb{R}_{+}}\varphi_{\mathcal{N}(\alpha+\beta/s,\,1/s)}(x)\,G(ds)\tag{3}$$

and proved the consistency of the non-parametric maximum likelihood estimator for the parameters $\alpha$ and $\beta$ , whereas $G$ was treated as a nuisance probability distribution. Although the maximum likelihood (ML) approach of Korsholm is rather general, its practical implementation would meet serious computational difficulties, since one would need to solve a rather challenging optimization problem. Note that the ML approach for similar models was also considered by van der Vaart [19]. Among other papers on related topics, let us mention the paper by Tjetjep and Seneta [18], where the method of moments was used for some special cases of the model (1), and the paper by Zhang [20], which is devoted to the problem of estimating the mixing density in location (mean) mixtures.

The main contribution of this paper is a new computationally efficient estimation approach which can be used to estimate both the parameter $\mu$ and the mixing distribution $G$ in a consistent way. This approach employs the Mellin transform technique and doesn’t involve any type of high-dimensional optimisation. We show that while our estimator of $\mu$ converges with parametric rate, a nonparametric estimator of the density of $G$ has much slower convergence rates.

The paper is organized as follows. In Section 2 the problem of statistical inference for $\mu$ is studied. Section 3 is devoted to the estimation of $G$ under known $\mu$ and Section 4 extends the results of Section 3 to the case of unknown $\mu.$ A simulation study is presented in Section 5 and a real data example can be found in Section 6.

2 Estimation of $\mu$

First note that the density in the normal variance-mean model can be represented in the following form

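$$p(x;\mu,G)=e^{x\mu}\,I_{\mu,G}\left(-x^{2}/2\right),\tag{4}$$

where

$$I_{\mu,G}(u):=\int_{\mathbb{R}_{+}}\frac{1}{\sqrt{2\pi s}}\exp\left\{\frac{u}{s}-\frac{\mu^{2}}{2}s\right\}G(ds).$$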
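As a short check (a worked step added for the reader, not part of the original derivation), the representation follows by expanding the square in the exponent of the normal density in (1):

$$-\frac{(x-\mu s)^{2}}{2s}=-\frac{x^{2}}{2s}+x\mu-\frac{\mu^{2}}{2}s.$$

The factor $e^{x\mu}$ does not depend on $s$ and can be taken out of the integral over $s$; the remaining integrand depends on $x$ only through $u=-x^{2}/2,$ so the remaining factor is an even function of $x.$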

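This observation in particular implies that

$$p(-x;\mu,G)=e^{-x\mu}\,I_{\mu,G}\left(-x^{2}/2\right),\tag{5}$$

and therefore, dividing (4) by (5), we get

$$\mu=\frac{1}{2x}\log\left(\frac{p(x;\mu,G)}{p(-x;\mu,G)}\right).\tag{6}$$

The formula (6) represents $\mu$ in terms of $p(\cdot;\mu,G).$ The representation (6) can also be written in the form

$$\mu=\int_{\mathbb{R}}\frac{1}{2x}\log\left(\frac{p(x;\mu,G)}{p(-x;\mu,G)}\right)p(x;\mu,G)\,dx,\tag{7}$$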

which looks similar to the entropy of $p(\cdot;\mu,G):$

$$H(p):=-\int_{\mathbb{R}}\log\left(p(x;\mu,G)\right)p(x;\mu,G)\,dx.$$

For a comprehensive overview of the methods of estimating $H(p)$ , we refer to [7]. Note also that the estimation of functionals like (7) was considered in [16]. Typically, parametric convergence rates for estimators of such functionals can be achieved only under very restrictive assumptions on the density $p(x).$ In the approach presented below, we avoid these restrictive conditions and prove square-root convergence under very mild assumptions. Let $w(x)$ be a Lipschitz continuous function on $\mathbb{R}$ satisfying

$$w(x)\leq 0,\quad x\geq 0,\qquad w(-x)=-w(x),\qquad\mathrm{supp}(w)\subset[-A,A]$$

for some $A>0.$ Set

$$W(\rho):=\mathbb{E}\left[e^{-\rho X}w(X)\right],\quad\rho\in\mathbb{R},$$

then the function $W(\rho)$ is monotone and $W(\mu)=0.$ This property suggests the following method to estimate $\mu.$ Without loss of generality we may assume that $\mu\in[0,M/2)$ for some $M>0.$ Set

$$\mu_{n}:=\inf\left\{\rho>0:W_{n}(\rho)=0\right\}\wedge M\tag{8}$$

with

$$W_{n}(\rho):=\frac{1}{n}\sum_{i=1}^{n}e^{-\rho X_{i}}w(X_{i}).$$

Note that since $\lim_{\rho\to-\infty}W_{n}(\rho)\leq 0,$ $\lim_{\rho\to\infty}W_{n}(\rho)\geq 0$ and

$$W_{n}^{\prime}(\rho)=\frac{1}{n}\sum_{i=1}^{n}(-X_{i})\,e^{-\rho X_{i}}w(X_{i})\geq 0$$

for all $\rho\in\mathbb{R},$ the function $W_{n}(\rho)$ is monotone and $\mu_{n}$ is unique. The following theorem describes the convergence properties of $\mu_{n}$ in terms of the norm $\left\|\mu_{n}-\mu\right\|_{p}:=\left({\mathbb{E}}\left|\mu_{n}-\mu\right|^{p}\right)^{1/p},$ where $p\geq 2.$

Theorem 2.1.

Let $p\geq 2$ and $M>0$ be such that

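$$\Lambda(M,p):=\left\|\left(1+e^{-MX}\right)Xw(X)\right\|_{p}<\infty.$$

Then

$$\left\|\mu_{n}-\mu\right\|_{p}\leq\frac{KM^{1-1/p}}{n^{1/2}}\left[\frac{1}{W^{\prime}(\mu)}+\frac{1}{W(M/2)}\right]$$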

with a constant $K$ depending on $p$ and $\Lambda(M,p)$ only.
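The estimator of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the weight $w(x)=-\sin(x)$ on $[-\pi,\pi]$, the gamma mixing distribution, the sample size, the cap `upper` (playing the role of $M$) and all function names are choices made here for the example. The root of the monotone function $W_{n}$ is located by bisection.

```python
import math
import random


def weight(x):
    """Odd weight with w(x) <= 0 for x >= 0, supported on [-pi, pi] (assumed choice)."""
    return -math.sin(x) if abs(x) <= math.pi else 0.0


def w_n(rho, xs):
    """Empirical W_n(rho) = (1/n) * sum_i exp(-rho * X_i) * w(X_i)."""
    total = 0.0
    for x in xs:
        w = weight(x)
        if w != 0.0:  # points outside the support of w contribute nothing
            total += math.exp(-rho * x) * w
    return total / len(xs)


def estimate_mu(xs, upper, iterations=40):
    """Bisection for the root of the nondecreasing function W_n on [0, upper].

    Simplification of this sketch: returns 0.0 if W_n(0) >= 0 and `upper`
    if W_n(upper) < 0, i.e. when no sign change is found inside the interval.
    """
    lo, hi = 0.0, upper
    if w_n(lo, xs) >= 0.0:
        return lo
    if w_n(hi, xs) < 0.0:
        return hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if w_n(mid, xs) < 0.0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate(n, mu, shape=2.0):
    """Draw X = mu * xi + sqrt(xi) * eta with xi ~ Gamma(shape, 1), eta ~ N(0, 1)."""
    xs = []
    for _ in range(n):
        xi = random.gammavariate(shape, 1.0)
        xs.append(mu * xi + math.sqrt(xi) * random.gauss(0.0, 1.0))
    return xs


random.seed(0)
sample = simulate(20000, mu=0.5)
mu_hat = estimate_mu(sample, upper=4.0)
print(round(mu_hat, 3))
```

No optimisation over the mixing distribution is needed: each evaluation of $W_{n}$ is a single pass over the sample, and monotonicity makes the bisection safe.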

3 Estimation of $G$ with known $\mu$

In this section, we assume that the distribution function $G$ has a Lebesgue density $g$ and our aim is to estimate $g$ from the i.i.d. observations $X_{1},\ldots,X_{n}$ of the random variable $X$ with the density $p(x;\mu,G),$ provided that the parameter $\mu$ is known. The idea of the estimation procedure is based on the following observation. Due to the representation (2), the characteristic function of $X$ has the form:

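$$\phi_{X}(u):=\mathbb{E}\left[e^{\mathrm{i}uX}\right]=\mathbb{E}\left[e^{-\xi\psi(u)}\right]=\mathcal{L}_{\xi}(\psi(u)),$$

where ${\mathcal{L}}_{\xi}(x):={\mathbb{E}}[e^{-\xi x}]=\int_{{\mathbb{R}}_{+}}e^{-sx}g(s)\,ds$ is the Laplace transform of the r.v. $\xi$ and $\psi(u)=-\mathrm{i}\mu u+u^{2}/2$ is the characteristic exponent of the normal r.v. with mean $\mu$ and variance $1.$ Our approach is based on the use of the Mellin transform technique. Set

$$\mathcal{M}\left[\mathcal{L}_{\xi}\right](z):=\int_{\mathbb{R}_{+}}\mathcal{L}_{\xi}(u)\,u^{z-1}\,du,$$

then by the Cauchy integral theorem

$$\int_{\mathbb{R}_{+}}\mathcal{L}_{\xi}(u)\,u^{z-1}\,du=\int_{l}\mathcal{L}_{\xi}(w)\,w^{z-1}\,dw,$$

where $l$ is the curve in the complex plane defined as the image of $\psi(u)$ by mapping from ${\mathbb{R}}$ to ${\mathbb{C}},$ that is, $l$ is the set of points $z\in{\mathbb{C}}$ satisfying $\mathrm{Im}(z)=-\mu\sqrt{2\;\mathrm{Re}(z)}.$ Therefore, we get

$$\mathcal{M}\left[\mathcal{L}_{\xi}\right](z)=\int_{\mathbb{R}_{+}}\mathcal{L}_{\xi}(\psi(u))\left[\psi(u)\right]^{z-1}\psi^{\prime}(u)\,du=\int_{\mathbb{R}_{+}}\phi_{X}(u)\left[\psi(u)\right]^{z-1}\psi^{\prime}(u)\,du$$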

so that the Mellin transform $\mathcal{M}\left[{\mathcal{L}}_{\xi}\right](z)$ can be estimated from data via

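$$\widehat{\mathcal{M}}\left[\mathcal{L}_{\xi}\right](z):=\begin{cases}\int_{0}^{U_{n}}\phi_{n}(u)\left[\psi(u)\right]^{z-1}\psi^{\prime}(u)\,du, & \mu\,\mathrm{Im}(z)<0,\\ \int_{0}^{U_{n}}\overline{\phi_{n}(u)}\left[\overline{\psi(u)}\right]^{z-1}\overline{\psi^{\prime}(u)}\,du, & \mu\,\mathrm{Im}(z)>0,\end{cases}\tag{10}$$

where

$$\phi_{n}(u)=\frac{1}{n}\sum_{k=1}^{n}e^{\mathrm{i}uX_{k}},$$

and $U_{n}$ is a sequence of positive numbers tending to infinity as $n\to\infty.$ This choice of the estimate for ${\mathcal{M}}\left[{\mathcal{L}}_{\xi}\right](z)$ is motivated by the fact that the function

$$\left|\left[\psi(u)\right]^{z}\right|=\exp\left\{\frac{\mathrm{Re}(z)}{2}\log\left(\mu^{2}u^{2}+u^{4}/4\right)\right\}\cdot\exp\left\{\mathrm{Im}(z)\cdot\arctan(2\mu/u)\right\}$$

is bounded for any $u\geq 0$ iff $\mu\;\mathrm{Im}(z)<0$ , and the function

$$\left|\left[\overline{\psi(u)}\right]^{z}\right|=\exp\left\{\frac{\mathrm{Re}(z)}{2}\log\left(\mu^{2}u^{2}+u^{4}/4\right)\right\}\cdot\exp\left\{\mathrm{Im}(z)\cdot\arctan(-2\mu/u)\right\}$$

is bounded for any $u\geq 0$ iff $\mu\;\mathrm{Im}(z)>0$ . Therefore, both integrals in (10) converge. Moreover, note that this estimate possesses the property $\overline{\widehat{\mathcal{M}}\left[{\mathcal{L}}_{\xi}\right](\overline{z})}=\widehat{\mathcal{M}}\left[{\mathcal{L}}_{\xi}\right](z),$ which also holds for the original Mellin transform $\mathcal{M}\left[{\mathcal{L}}_{\xi}\right](z).$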
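The whole construction rests on the identity $\phi_{X}(u)={\mathcal{L}}_{\xi}(\psi(u)),$ which can be checked by simulation. The sketch below is an illustration only: it assumes a gamma mixing distribution with unit scale, for which ${\mathcal{L}}_{\xi}(x)=(1+x)^{-a}$ with shape $a$, and compares this with the empirical characteristic function $\phi_{n}.$ The function names, sample size and parameter values are assumptions of this example.

```python
import cmath
import math
import random


def psi(u, mu):
    """Characteristic exponent psi(u) = -i*mu*u + u^2/2."""
    return -1j * mu * u + 0.5 * u * u


def laplace_gamma(x, shape):
    """Laplace transform E[exp(-x*xi)] of xi ~ Gamma(shape, scale=1), for Re(x) > -1."""
    return (1.0 + x) ** (-shape)


def empirical_cf(u, xs):
    """Empirical characteristic function phi_n(u) = (1/n) * sum_k exp(i*u*X_k)."""
    return sum(cmath.exp(1j * u * x) for x in xs) / len(xs)


def simulate(n, mu, shape):
    """Draw X = mu * xi + sqrt(xi) * eta with xi ~ Gamma(shape, 1), eta ~ N(0, 1)."""
    xs = []
    for _ in range(n):
        xi = random.gammavariate(shape, 1.0)
        xs.append(mu * xi + math.sqrt(xi) * random.gauss(0.0, 1.0))
    return xs


random.seed(1)
mu, shape = 0.5, 2.0
sample = simulate(20000, mu, shape)

# Largest discrepancy between phi_n(u) and L_xi(psi(u)) over a few frequencies.
max_gap = max(
    abs(empirical_cf(u, sample) - laplace_gamma(psi(u, mu), shape))
    for u in (0.5, 1.0, 2.0)
)
print(max_gap)
```

The discrepancy is of the order $n^{-1/2},$ the usual fluctuation of an empirical characteristic function at a fixed frequency.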

The Mellin transform $\mathcal{M}\left[{\mathcal{L}}_{\xi}\right](z)$ is closely connected to the Mellin transform of the density $g.$ Indeed,

[TABLE]

Therefore, the Mellin transform of the density of the r.v. $\xi$ can be represented as

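$$\mathcal{M}\left[g\right](z)=\frac{\mathcal{M}\left[\mathcal{L}_{\xi}\right](1-z)}{\Gamma(1-z)}=\frac{1}{\Gamma(1-z)}\int_{\mathbb{R}_{+}}\phi_{X}(u)\left[\psi(u)\right]^{-z}\psi^{\prime}(u)\,du.\tag{11}$$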
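As a sanity check of this relation (a worked example added here, not taken from the paper), let $\xi$ be standard exponential, so that $g(s)=e^{-s}$ and ${\mathcal{L}}_{\xi}(u)=1/(1+u).$ For $0<\mathrm{Re}(z)<1,$

$$\mathcal{M}\left[\mathcal{L}_{\xi}\right](z)=\int_{\mathbb{R}_{+}}\frac{u^{z-1}}{1+u}\,du=\frac{\pi}{\sin(\pi z)}=\Gamma(z)\,\Gamma(1-z),$$

and therefore

$$\frac{\mathcal{M}\left[\mathcal{L}_{\xi}\right](1-z)}{\Gamma(1-z)}=\frac{\Gamma(1-z)\,\Gamma(z)}{\Gamma(1-z)}=\Gamma(z)=\int_{\mathbb{R}_{+}}s^{z-1}e^{-s}\,ds=\mathcal{M}\left[g\right](z),$$

in agreement with the first equality of the representation above.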

Using the last expression and taking into account (10), we define the estimate of $\mathcal{M}\left[g\right](z)$ by

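$$\widehat{\mathcal{M}}\left[g\right](z):=\frac{\widehat{\mathcal{M}}\left[\mathcal{L}_{\xi}\right](1-z)}{\Gamma(1-z)}.$$

Finally, we apply the inverse Mellin transform to estimate the density $g$ of the r.v. $\xi.$ Since the inverse Mellin transform of $\mathcal{M}\left[g\right](\gamma+\mathrm{i}v)$ is given by

$$g(x)=\frac{1}{2\pi}\int_{\mathbb{R}}\mathcal{M}\left[g\right](\gamma+\mathrm{i}v)\cdot x^{-\gamma-\mathrm{i}v}\,dv$$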

for any $\gamma\in(0,1),$ we define the estimate of the mixing density $g$ via

[TABLE]

for some $\gamma\in(0,1)$ and a sequence $V_{n}\to\infty$ as $n\to\infty.$ The convergence rates of the estimate $\widehat{g}_{n,\gamma}^{\circ}$ crucially depend on the asymptotic behavior of the Mellin transform of the true density function $g.$ In order to specify this behavior, we introduce two classes of probability densities:

[TABLE]

where $\alpha,\beta\in{\mathbb{R}}_{+},$ $L>0$ , $0<\gamma_{\circ}<\gamma^{\circ}<1$ . For instance, the gamma distribution belongs to the first class and the beta distribution to the second, see [8].
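The effect of cutting the inverse Mellin integral at a finite level $V$ can be illustrated numerically. The sketch below is an assumption-laden example rather than the estimator itself: it uses the exact Mellin transform $\Gamma(z)$ of the standard exponential density $g(x)=e^{-x}$ in place of an estimate, evaluates $\Gamma$ for complex arguments by a Lanczos approximation (the Python standard library has no complex gamma function), and integrates by the trapezoid rule. The function names and the values of $\gamma$ and $V$ are chosen here for illustration.

```python
import cmath
import math

# Lanczos coefficients (g = 7, 9 terms), a standard choice for double precision.
_LANCZOS = [
    0.99999999999980993, 676.5203681218851, -1259.1392167224028,
    771.32342877765313, -176.61502916214059, 12.507343278686905,
    -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7,
]


def gamma_complex(z):
    """Gamma(z) for complex z via the Lanczos approximation and reflection."""
    z = complex(z)
    if z.real < 0.5:
        return math.pi / (cmath.sin(math.pi * z) * gamma_complex(1 - z))
    z -= 1
    acc = _LANCZOS[0]
    for i in range(1, len(_LANCZOS)):
        acc += _LANCZOS[i] / (z + i)
    t = z + 7.5
    return math.sqrt(2 * math.pi) * t ** (z + 0.5) * cmath.exp(-t) * acc


def inverse_mellin(mellin, x, gamma_line, v_max, step=0.01):
    """(1/(2*pi)) * integral over [-v_max, v_max] of M(gamma+iv) * x^(-gamma-iv) dv."""
    n = int(round(2 * v_max / step))
    total = 0j
    for k in range(n + 1):
        z = complex(gamma_line, -v_max + k * step)
        term = mellin(z) * cmath.exp(-z * math.log(x))
        total += term * (0.5 if k in (0, n) else 1.0)  # trapezoid end weights
    return (total * step / (2 * math.pi)).real


# Wide cut-off: the exponential density is recovered almost exactly.
g_wide = inverse_mellin(gamma_complex, 1.0, gamma_line=0.5, v_max=30.0)
# Narrow cut-off: a visible truncation bias remains.
g_narrow = inverse_mellin(gamma_complex, 1.0, gamma_line=0.5, v_max=1.0)
print(g_wide, g_narrow, math.exp(-1.0))
```

Because $|\Gamma(\gamma+\mathrm{i}v)|$ decays exponentially in $|v|,$ a moderate cut-off already removes most of the bias for this density; for Mellin transforms with polynomial decay the truncation error shrinks much more slowly.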

The following convergence rates are proved in Section 7.2.

Theorem 3.1.

Let $U_{n}=n^{1/4}$ and $V_{n}=\kappa\ln(n)$ for some $\kappa>0.$

(i)

If $g\in\mathcal{E}(\alpha,\gamma_{\circ},\gamma^{\circ},L)$ for some $\alpha\in{\mathbb{R}}_{+},L>0$ , $0<\gamma_{\circ}<\gamma^{\circ}<1/2,$ then under the choice $\kappa=\gamma^{\circ}/(\pi+2\alpha),$ it holds for any $x\in{\mathbb{R}}_{+},$

[TABLE]

for any $\gamma\in(\gamma_{\circ},\gamma^{\circ}),$ where $\lesssim$ stands for an inequality up to some positive finite constant depending on $\alpha,\gamma_{\circ},\gamma^{\circ}$ and $L.$

(ii)

If $g\in\mathcal{P}(\beta,\gamma_{\circ},\gamma^{\circ},L)$ for some $\beta\in{\mathbb{R}}_{+},L>0$ , $0<\gamma_{\circ}<\gamma^{\circ}<1/2,$ then for any $\kappa>0$ and any $x\in{\mathbb{R}}_{+},$

[TABLE]

for any $\gamma\in(\gamma_{\circ},\gamma^{\circ}),$ where $\lesssim$ stands for an inequality up to some positive finite constant depending on $\beta,\gamma_{\circ},\gamma^{\circ}$ and $L.$

4 Estimation of $G$ with unknown $\mu$

Using the same strategy as in the previous section and substituting the true value $\mu$ by the estimate $\widehat{\mu}_{n},$ we arrive at the following estimate of the density function $g$ in the case of an unknown $\mu:$

[TABLE]

where $\widehat{\psi}_{n}(u):=-\mathrm{i}\widehat{\mu}_{n}u+u^{2}/2,\;u\in{\mathbb{R}}.$ The next theorem shows that the difference between $\widehat{g}_{n,\gamma}(x)$ and $\widehat{g}_{n,\gamma}^{\circ}(x)$ is basically of order $\widehat{\mu}_{n}-\mu.$

Theorem 4.1.

Let the assumptions of Theorem 3.1 be fulfilled, $U_{n}=n^{1/4},\;V_{n}=\kappa\ln(n)$ for some $\kappa>0$ and $\mu\neq 0$ . Furthermore, let $\widehat{\mu}_{n}$ be a consistent estimate of $\mu.$ Then for any $x\in{\mathbb{R}},$

[TABLE]

for any $\gamma\in(\gamma_{\circ},\gamma^{\circ}),$ where

[TABLE]

and $\beta_{n},\delta_{n}$ are positive deterministic sequences such that

[TABLE]

as $n\to\infty,$ where $\lesssim$ stands for an inequality with some positive finite constant depending on the parameters of the corresponding class. In particular, in the setup of Theorem 3.1(i), $\beta_{n}\lesssim n^{-1/8}(\ln(n))^{1/2},$ $\delta_{n}\lesssim n^{1/8}\ln(n).$

Corollary 4.2.

In the setup of Theorem 2.1, it holds

[TABLE]

for any $\gamma\in(\gamma_{\circ},\gamma^{\circ}).$

5 Numerical example

In this section, we illustrate the performance of our estimation algorithm in the case when $G$ is the distribution function of the so-called generalized inverse Gaussian distribution $GIG\left(\lambda,\delta,\psi\right)$ with a density

[TABLE]

where $\lambda\in{\mathbb{R}},\delta>0,\psi>0,$ and $K_{\lambda}(x)=(1/2)\int_{{\mathbb{R}}_{+}}u^{\lambda-1}\exp\left\{-(x/2)\left(u+u^{-1}\right)\right\}du$ is the modified Bessel function of the third kind. Trivially, $GIG\left(\lambda,\delta,\psi\right)$ is an exponential class of distributions. Furthermore, it is interesting to note that these distributions are self-decomposable, see [11], and therefore infinitely divisible.
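The normalising constant of the GIG density requires values of $K_{\lambda}.$ As a rough illustration (an addition for the reader, assuming the standard integral representation $K_{\lambda}(x)=(1/2)\int_{{\mathbb{R}}_{+}}u^{\lambda-1}\exp\{-(x/2)(u+u^{-1})\}du$), the substitution $u=e^{t}$ turns the integral into $(1/2)\int_{{\mathbb{R}}}\exp\{\lambda t-x\cosh t\}dt,$ which the trapezoid rule handles well. The function name, cut-off and step size below are choices of this sketch.

```python
import math


def bessel_k(lam, x, t_max=10.0, step=0.001):
    """K_lam(x) = (1/2) * integral over R of exp(lam*t - x*cosh(t)) dt (trapezoid rule)."""
    n = int(round(2 * t_max / step))
    total = 0.0
    for k in range(n + 1):
        t = -t_max + k * step
        value = math.exp(lam * t - x * math.cosh(t))
        total += value * (0.5 if k in (0, n) else 1.0)
    return 0.5 * total * step


# Closed form for half-integer order: K_{1/2}(x) = sqrt(pi / (2x)) * exp(-x).
k_half = bessel_k(0.5, 1.0)
k_half_exact = math.sqrt(math.pi / 2.0) * math.exp(-1.0)
print(k_half, k_half_exact)
```

In practice a library routine for the modified Bessel function would be preferable; the quadrature is shown only to make the definition concrete.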

With this choice of the mixing distribution $G$ , the random variable $X$ defined by (2), has the so-called generalized hyperbolic distribution, GH $\left(\alpha,\lambda,\delta,\psi\right)$ ( $\alpha=\sqrt{\psi^{2}+\mu^{2}}$ ), with a density function, which can be explicitly computed via (1). In particular, in the case $\lambda=1,$ the density function is of the form

[TABLE]

It is interesting to note that the plot of the log-density has two asymptotes $y=[\log(\psi)-\log(2\alpha\delta K_{1}(\delta\psi))]+(\mu\pm\alpha)x$ , see Figure 1. For some other properties of this distribution, we refer to [9].

The aim of this simulation study is to estimate $\mu$ and $g$ based on the observations $X_{1},\ldots,X_{n}$ of the r.v. $X$ . Following the idea of Section 2, we first choose the odd weighting function

[TABLE]

Note that $w(x)$ is bounded and supported on $[-\pi,\pi].$ For our numerical study, we take $\lambda=\delta=\psi=1,$ and $\mu=1/2.$ The boxplots of the estimate $\widehat{\mu}_{n}$ based on $100$ simulation runs are shown in Figure 2.

Next, we estimate the density function $g(x)$ for $x\in\{x_{1},\ldots,x_{M}\},$ where $\{x_{1},\ldots,x_{M}\}$ constitute an equidistant grid on $[0.1,5].$ To this end, we use the estimate constructed in Section 3,

[TABLE]

where $\widehat{\phi}_{n}(u)=n^{-1}\sum_{k=1}^{n}e^{\mathrm{i}uX_{k}}$ is the empirical characteristic function of the random variable $X$ . The error of estimation is measured by

[TABLE]

We take $\gamma=0.1$ and the parameters $U_{n}$ and $V_{n}$ are chosen by numerical optimization of the functional $R(\widehat{g}_{n}^{\circ}),$ which yields in our case the values $U_{n}=7.6$ and $V_{n}=0.9.$ Following the ideas of Section 4, we also consider the estimate $\widehat{g}_{n,\gamma}(x),$ which is obtained from $\widehat{g}_{n}^{\circ}(x)$ by replacing $\mu$ with its estimate $\widehat{\mu}_{n},$ see (14). The difference between $\widehat{g}^{\circ}_{n,\gamma}(x)$ and $\widehat{g}_{n,\gamma}(x)$ (which was theoretically considered in Theorem 4.1) is illustrated by the boxplots in Figure 3, which show that the quality of these estimates is essentially the same.

6 Real data example

In this section, we provide an example of the application of our model (1) for describing the diamond sizes in marine deposits in South West Africa. The motivation for using this model in this problem can be found in the paper by Sichel [17]: “According to one geological theory diamonds were transported from inland down the Orange River… One would expect then that the diamond traps would catch the larger stones preferentially and that the average stone weights would decrease as distance from the river mouth increased. Intensive sampling has actually proved this hypothesis to be correct…”

Later, Sichel claims that although “for relatively small mining areas, and particular for a single-trench unit, the size distributions appear to follow the two-parameter lognormal law,” for large mining areas, one should expect that the parameters of the lognormal law depend on the distance from the mouth of the river. Moreover, taking into account the geological studies, it is reasonable to assume that these parameters are related inversely to the distance from the mouth of the river and related directly to each other. Based on these ideas, Sichel proposes to use the model (1) (or, more precisely, a slightly more general model (3)) with $G$ corresponding to the gamma distribution. Later, Barndorff-Nielsen [3] applied the same model with $G$ corresponding to the generalized inverse Gaussian distribution, which was presented above in Section 5.

Below we apply our approach to the same data, which can be found (in aggregated form) both in [3] (p. 409) and [17] (p. 242). We have 1022 observations of stone sizes, measured in carats, and aim to fit the model (1) to the density of the logarithms of these sizes. The estimation scheme consists of two steps.

First, we estimate the parameter $\mu$ by $\widehat{\mu}_{n}$ defined in (8). In this example, we obtained the estimate $\hat{\mu}_{n}=0.068$ . Note that the positive sign of this estimate is important because the parameters are required to be directly related to each other.

Second, we estimate the density $g(s)$ by $\widehat{g}_{n}(s)$ defined in (14) for $s\in\{s_{1},\ldots,s_{m}\},$ an equidistant grid on $[0.1,8]$ with step $\Delta_{s}$ . The plot of this function is given in Figure 4.

To illustrate the performance of our procedure, we also estimate the density fitted by the model (1):

[TABLE]

The performance of this estimate can be checked visually in Figure 5.
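A fitted density of this kind can be evaluated by numerical integration over a grid in $s.$ The sketch below is a hedged illustration and makes several assumptions of its own: the fitted density is taken to be the mixture (1) with plug-in values of $\mu$ and $g$, the integral is replaced by a midpoint Riemann sum, and a standard exponential mixing density stands in for the estimate because the mixture then has the closed form $e^{\mu x}\exp\{-|x|\sqrt{2+\mu^{2}}\}/\sqrt{2+\mu^{2}}.$ The grid, the value of $\mu$ and the function names are illustrative.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_density(x, mu, grid, g_values):
    """Riemann-sum approximation of the integral of phi_{N(mu*s, s)}(x) * g(s) ds."""
    step = grid[1] - grid[0]
    return step * sum(normal_pdf(x, mu * s, s) * g for s, g in zip(grid, g_values))


def closed_form(x, mu):
    """Exact mixture density when the mixing density is g(s) = exp(-s)."""
    root = math.sqrt(2.0 + mu * mu)
    return math.exp(mu * x - abs(x) * root) / root


mu = 0.3
step = 0.002
grid = [(j + 0.5) * step for j in range(20000)]  # midpoints covering (0, 40)
g_values = [math.exp(-s) for s in grid]          # stand-in for an estimated density

p_pos = mixture_density(1.0, mu, grid, g_values)
p_neg = mixture_density(-1.0, mu, grid, g_values)
mu_back = math.log(p_pos / p_neg) / 2.0          # formula (6) evaluated at x = 1
print(p_pos, closed_form(1.0, mu), mu_back)
```

The last quantity recovers $\mu$ through formula (6), since the ratio of the two normal densities equals $e^{2x\mu}$ for every $s$ on the grid.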

7 Proofs

7.1 Proof of Theorem 2.1

Note that

[TABLE]

The first summand in the r.h.s. can be bounded by taking into account that

[TABLE]

for some $\widetilde{\mu}_{n}\in(\mu\wedge\mu_{n},\mu\vee\mu_{n})$ and hence on the event $\{\mu_{n}<M\}$ it holds

[TABLE]

Note that the function

[TABLE]

is positive on $\mathbb{R}_{+}$ and attains its minimum at $\rho=\mu$ with

[TABLE]

Hence

[TABLE]

and continuing the line of reasoning in (16), we get

[TABLE]

Furthermore, due to the monotonicity of $W$ and the fact that $W_{n}(M/2)<0$ on the event $\{\mu_{n}=M\},$ we get

[TABLE]

Set $\Delta_{n}(\rho):=\sqrt{n}\left(W_{n}(\rho)-W(\rho)\right),$ then

[TABLE]

with $\rho,\rho^{\prime}\in[0,M].$ By the Rosenthal and Lyapunov inequalities

[TABLE]

where the second inequality follows from $|e^{x}-e^{y}|\leq|x-y|\left(e^{x}+e^{y}\right),\forall x,y>0.$ Next, applying the maximal inequality (see Theorem 8.4 in [15]), we get

[TABLE]

for some constants $K$ and $K^{\prime}$ depending on $p$ and $\left\|e^{-MX}w(X)\right\|_{p}.$ Hence

[TABLE]

Finally, we get the result

[TABLE]

7.2 Proof of Theorem 3.1

1. The bias of $\widehat{g}_{n,\gamma}^{\circ}(x)$

[TABLE]

Taking into account (11), we get that the last term in this representation can be written as

[TABLE]

where

[TABLE]

Substituting these expressions into (17) and taking into account that $\overline{\psi(u)}=\psi(-u)$ and $\overline{\psi^{\prime}(u)}=-\psi^{\prime}(-u)$ , we derive

[TABLE]

where

[TABLE]

An upper bound for $J_{3}$ follows directly from our assumption on the asymptotic behavior of the Mellin transform $\mathcal{M}\left[g\right].$ In the exponential case,

[TABLE]

whereas in the polynomial case $J_{3}\leq L(2\pi)^{-1}|x|^{-\gamma}V_{n}^{-\beta}.$ To show the asymptotic behavior of $J_{1}$ , $J_{2}$ , we first derive an upper bound for the characteristic function $\phi_{X}(u):$

[TABLE]

where the last asymptotic inequality follows from the integration by parts:

[TABLE]

for all $w\in{\mathbb{C}}.$ Second, it holds

[TABLE]

for any $w\in{\mathbb{C}}$ such that $\mathrm{Re}(w)\geq-2,|\mathrm{Im}(w)|\geq 1$ and some constant $C>0$ , see [1]. Therefore, we get

[TABLE]

2. Next, we consider the variance of the estimate $\widehat{g}_{n,\gamma}^{\circ}:$

[TABLE]

where

[TABLE]

The last inequality follows from the fact that for any integrable function $f:{\mathbb{C}}^{1+m}\to{\mathbb{C}}$ and any random variable $Z$ it holds

[TABLE]

It is worth mentioning that

[TABLE]

and moreover it holds

[TABLE]

where we use the assumption $\gamma<1/2$ . Finally, we conclude that

[TABLE]

where we use that

[TABLE]

3. Set $\rho(x)=|x|^{2\gamma},$ then we have the following bias-variance decomposition:

[TABLE]

For instance, in the exponential case, this decomposition yields

[TABLE]

The last expression suggests the choice $U_{n}=n^{1/4},$ under which

[TABLE]

Choosing $V_{n}$ in the form $V_{n}=\kappa\ln(n),$ we arrive at the desired result.

7.3 Proof of Theorem 4.1

It’s easy to see that

[TABLE]

where

[TABLE]

Below we consider $\Lambda_{n}^{(1)}(u,v)$ in detail; the treatment of the second term follows the same lines. Denote

[TABLE]

In this notation,

[TABLE]

Note that

[TABLE]

Next, applying the Taylor theorem for the function $g(x)=(1+zx)^{w}:{\mathbb{R}}\to{\mathbb{C}}$ in the vicinity of zero with $z=\mathrm{i}/(\mathrm{i}\mu+u/2),\;w=-\gamma-\mathrm{i}v,\;x=\widehat{\mu}_{n}-\mu\in{\mathbb{R}},$ we get

[TABLE]

where

[TABLE]

where $\theta\in(0,1)$ and $\tilde{\mu}=\theta\widehat{\mu}_{n}+(1-\theta)\mu.$ Note that uniformly on $u\in[0,U_{n}]$ and $v\in[-V_{n},V_{n}]$ it holds

[TABLE]

and moreover $|r_{n}(u)(\mathrm{i}\mu+u)|\lesssim V_{n}^{2}e^{\pi V_{n}}.$ From (20) it follows then

[TABLE]

and

[TABLE]

In the sequel we assume for simplicity that at the second stage (estimation of $G$ ) we use another sample, independent of the one used for the estimation of $\mu$ . Substituting (21) into (19), we get (15) with

[TABLE]

Note that due to the Minkowski inequality,

[TABLE]

Taking into account that $|\phi_{X}(\cdot)|\leq 1,$ $\left|-\left(\gamma+\mathrm{i}v\right)\left(\mathrm{i}\mu+u\right)\left(\mathrm{i}\mu+u/2\right)^{-1}+1\right|\lesssim 2V_{n}+1,$ and moreover

[TABLE]

we get that

[TABLE]

where we use the inequality (18). Therefore, under our choice of $U_{n}$ and $V_{n}$ , we get $\beta_{n}^{2}\lesssim n^{-(3/4)+\pi\kappa}\ln(n).$ Analogously,

[TABLE]

This observation completes the proof.

Bibliography

[1] Andrews, G.E., Askey, R., and Roy, R. Special functions, volume 71 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, 1999.
[2] Bagnold, R.A. The physics of blown sand and desert dunes. Methuen, London, 1941.
[3] Barndorff-Nielsen, O. Exponentially decreasing distributions for the logarithm of particle size. Proc. R. Soc. London, A(353):401–419, 1977.
[4] Barndorff-Nielsen, O. Normal inverse Gaussian distributions and stochastic volatility modelling. Scandinavian Journal of Statistics, 24(1):1–13, 1997.
[5] Barndorff-Nielsen, O. and Christensen, C. Erosion, deposition, and size distributions of sand. Proc. R. Soc. London, A(417):335–352, 1988.
[6] Barndorff-Nielsen, O., Kent, J., and Sørensen, M. Normal variance-mean mixtures and z distributions. International Statistical Review, 50:145–159, 1982.
[7] Beirlant, J., Dudewicz, E. J., Györfi, L. and van der Meulen, E. C. Nonparametric entropy estimation: an overview. Int. J. Math. Stat. Sci., 6(1):17–39, 1997.
[8] Belomestny, D. and Panov, V. Statistical inference for generalized Ornstein-Uhlenbeck processes. Electron. J. Statist., 9:1974–2006, 2015.
