Grenander functionals and Cauchy's formula

Piet Groeneboom

arXiv:1902.08806·math.PR·November 21, 2019


TL;DR

This paper extends the analysis of the Grenander estimator's properties using complex analysis techniques, correcting previous results and deriving new asymptotic distributions for integrals of the estimator.

Contribution

It introduces a general framework for analyzing Grenander functionals using Cauchy's formula, correcting earlier inaccuracies and expanding asymptotic results.

Findings

1. Extended the asymptotic distribution results for integrals of the Grenander estimator.

2. Corrected the limit distribution of the sums of gamma and Poisson variables given in Groeneboom and Pyke (1983).

3. Applied saddle point methods and Cauchy's formula to the analysis of a nonparametric estimator.

Abstract

Let $\hat{f}_{n}$ be the nonparametric maximum likelihood estimator of a decreasing density. Grenander characterized this as the left-continuous slope of the least concave majorant of the empirical distribution function. For a sample from the uniform distribution, the asymptotic distribution of the $L_{2}$ -distance of the Grenander estimator to the uniform density was derived in Groeneboom and Pyke (1983) by using a representation of the Grenander estimator in terms of conditioned Poisson and gamma random variables. This representation was also used in Groeneboom and Lopuhaa (1993) to prove a central limit result of Sparre Andersen on the number of jumps of the Grenander estimator. Here we extend this to the proof of a general result on integrals of the Grenander estimator. We also correct Groeneboom and Pyke (1983), where the limit distribution of the sums of gamma and Poisson variables on…

Equations

$$\sqrt{n}\,\bigl\{\hat{f}_{n}(t)-1\bigr\}\stackrel{\mathcal{D}}{\longrightarrow}S_{t},$$

$$n^{1/3}\,\bigl|4f(t)f^{\prime}(t)\bigr|^{-1/3}\bigl\{\hat{f}_{n}(t)-f(t)\bigr\}\stackrel{\mathcal{D}}{\longrightarrow}Z,$$

$$\mu(h,f)=\int_{0}^{1}h(f(x))\,dx,$$

$$\frac{n\bigl\{\mu(h,\hat{f}_{n})-\mu(h,f)\bigr\}-\frac{1}{2}h^{\prime\prime}(1)\log n}{\sqrt{\frac{3}{4}h^{\prime\prime}(1)^{2}\log n}}\stackrel{\mathcal{D}}{\longrightarrow}N(0,1),$$

$$|h(1-it)|=O(t^{2}),\qquad |t|\to\infty,$$

$$|h^{\prime\prime}(1-it)|=O(1),\qquad |t|\to\infty.$$

$$\frac{1}{\sqrt{3\log n}}\left\{n\int_{0}^{1}\bigl\{\hat{f}_{n}(t)-1\bigr\}^{2}\,dt-\log n\right\}\stackrel{\mathcal{D}}{\longrightarrow}N(0,1),\qquad n\to\infty.$$

$$\frac{1}{\sqrt{\frac{3}{4}\log n}}\left\{n\int_{0}^{1}\hat{f}_{n}(t)\log\hat{f}_{n}(t)\,dt-\tfrac{1}{2}\log n\right\}\stackrel{\mathcal{D}}{\longrightarrow}N(0,1),\qquad n\to\infty.$$

$$\begin{array}{ll}D_{ni}=\xi_{ni}-\xi_{n,i-1}, & i=1,\dots,m+1,\\ J_{ni}=n\left\{\mathbb{F}_{n}(\xi_{ni})-\mathbb{F}_{n}(\xi_{n,i-1})\right\}, & i=1,\dots,m+1,\\ Q_{nj}=\#\left\{i:J_{ni}=j\right\}, & \end{array}$$

$$S_{n}=\sum_{j=1}^{n}\sum_{i=1}^{N_{j}}S_{ji},\qquad T_{n}=\sum_{j=1}^{n}jN_{j}.$$

$$\bm{S}^{(n)}=(S_{11},\dots,S_{1,N_{1}},\dots,S_{n1},\dots,S_{n,N_{n}}),\qquad \bm{N}^{(n)}=(N_{1},\dots,N_{n}).$$

$$\left(nD_{n1},\dots,nD_{n,m+1};Q_{n1},\dots,Q_{n,m}\right)\stackrel{\mathcal{D}}{=}\left(\bm{S}^{(n)},\bm{N}^{(n)}\bigm|S_{n}=n,T_{n}=n\right)$$

$$V_{n}=n^{-1/2}\biggl\{S_{n}-\sum_{j=1}^{n}jN_{j}\biggr\}=n^{-1/2}\left\{S_{n}-T_{n}\right\},\qquad W_{n}=\frac{T_{n}}{n},$$

$$V_{n}=0,\qquad W_{n}=1.$$

$$\frac{N_{jumps}-\log n}{\sqrt{\log n}}\stackrel{\mathcal{D}}{\longrightarrow}N(0,1),$$

$$U_{n}=\frac{\sum_{j=1}^{n}N_{j}-\log n}{\sqrt{\log n}},$$

$$\mathbb{E}\left\{e^{isU_{n}}\bigm|T_{n}=n\right\}.$$

$$\mathbb{P}\left\{T_{n}=n\right\}=\exp\biggl\{-\sum_{j=1}^{n}\frac{1}{j}\biggr\}.$$

$$\begin{aligned}\mathbb{E}\left\{e^{isU_{n}}\bigm|T_{n}=n\right\}&=\exp\Biggl\{\sum_{j=1}^{n}\frac{1}{j}\Biggr\}\frac{1}{2\pi}\int_{u=-\pi}^{\pi}\mathbb{E}\,e^{isU_{n}+iuT_{n}-inu}\,du\\ &=\exp\Biggl\{\sum_{j=1}^{n}\frac{1}{j}\Biggr\}\frac{1}{2\pi}\int_{u=-\pi}^{\pi}\exp\Biggl\{-isb_{n}+\sum_{j=1}^{n}\frac{1}{j}\left(e^{iju+is/b_{n}}-1\right)-inu\Biggr\}\,du\\ &=e^{-isb_{n}}\frac{1}{2\pi}\int_{u=-\pi}^{\pi}\exp\Biggl\{e^{is/b_{n}}\sum_{j=1}^{n}\frac{1}{j}e^{iju}-inu\Biggr\}\,du.\end{aligned}$$

$$e^{-isb_{n}}\frac{1}{2\pi i}\int_{C}\exp\Biggl\{\delta_{n}\sum_{j=1}^{n}\frac{z^{j}}{j}\Biggr\}z^{-n-1}\,dz,$$

$$\delta_{n}=e^{is/b_{n}}.$$

$$z\mapsto\exp\Biggl\{\delta_{n}\sum_{j=1}^{n}\frac{z^{j}}{j}\Biggr\}.$$

$$(-1)^{n}\binom{-\delta_{n}}{n}=\prod_{j=1}^{n}\left(1+\frac{\delta_{n}-1}{j}\right)=\exp\Biggl\{\sum_{j=1}^{n}\log\left(1+\frac{\delta_{n}-1}{j}\right)\Biggr\}.$$

$$\begin{aligned}\mathbb{E}\left\{e^{isU_{n}}\bigm|T_{n}=n\right\}&=\exp\Biggl\{-isb_{n}+\sum_{j=1}^{n}\log\left(1+\frac{\delta_{n}-1}{j}\right)\Biggr\}\\ &=\exp\Biggl\{-isb_{n}+\sum_{j=1}^{n}\log\left(1+\frac{e^{is/b_{n}}-1}{j}\right)\Biggr\}\\ &=\exp\Biggl\{-isb_{n}+\sum_{j=1}^{n}\frac{e^{is/b_{n}}-1}{j}+o(1)\Biggr\}=\exp\Biggl\{-\frac{s^{2}}{2b_{n}^{2}}\sum_{j=1}^{n}\frac{1}{j}+o(1)\Biggr\}\\ &=\exp\left\{-\tfrac{1}{2}s^{2}+o(1)\right\}.\end{aligned}$$

$$\phi_{(V,W)}(t,u)=\exp\left\{\int_{0}^{1}\frac{e^{-(\frac{1}{2}t^{2}-iu)y}-1}{y}\,dy\right\}.$$

$$\mathbb{E}\exp\left\{itV_{n}+iuW_{n}\right\}=\exp\Biggl\{\sum_{j=1}^{n}\frac{e^{ij(u/n-tn^{-1/2})}\left(1-itn^{-1/2}\right)^{-j}-1}{j}\Biggr\},$$


Full text

Grenander functionals and Cauchy’s formula

Piet Groeneboom

http://dutiosc.twi.tudelft.nl/~pietg

Delft University of Technology, Building 28, Van Mourik Broekmanweg 6, 2628 XE Delft, The Netherlands.

Abstract

Let $\hat{f}_{n}$ be the nonparametric maximum likelihood estimator of a decreasing density. Grenander characterized this in [6] as the left-continuous slope of the least concave majorant of the empirical distribution function. For a sample from the uniform distribution, the asymptotic distribution of the $L_{2}$ -distance of the Grenander estimator to the uniform density was derived in [12] by using a representation of the Grenander estimator in terms of conditioned Poisson and gamma random variables. This representation was also used in [8] to prove a central limit result of Sparre Andersen [19] on the number of jumps of the Grenander estimator. Here we extend this to the proof of the main result in [12] and also prove a similar asymptotic normality result for the entropy functional. In [12] the limit distribution of the sums of gamma and Poisson variables on which the conditioning was done did not have the right form, which is corrected here. Cauchy’s formula and saddle point methods are the main tools in our development.

AMS subject classifications: 62E20, 62G05.

Keywords: Grenander estimator, integral statistics, saddle points, Cauchy’s formula.

This manuscript is dedicated to the memory of Ronald Pyke.

1 Introduction

The Grenander estimator is the (nonparametric) maximum likelihood estimator (MLE) of a monotone decreasing density. It was introduced in [6], where it was proved that it is the left-continuous slope of the least concave majorant of the empirical distribution function. Some properties and limit results are discussed in [10] and also in [11] in the special issue on nonparametric inference under shape constraints of the journal Statistical Science. The Grenander estimator is shown in Figure 1 for a sample of size $n=100$ from the uniform distribution on $[0,1]$ . It can be improved by using boundary penalties (in fact, the estimator is inconsistent at the boundary points $0$ and $1$ ), but this is not the concern of the present paper.

The Grenander estimator is a piecewise constant function with downward jumps at locations which correspond to the changes of slope (“kinks”) of the least concave majorant of the empirical distribution function. Although the Grenander estimator is defined as the left-continuous slope of the least concave majorant of the empirical distribution function, we can make it right-continuous by taking the limits from the right at its points of jump. This does not change the induced probability distribution, which is absolutely continuous w.r.t. Lebesgue measure.
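The characterization above can be made concrete with a small sketch. It is only an illustration, not code from the paper: the function name `grenander` and the stack-based construction of the least concave majorant are choices made here, and the sample is assumed to have no ties and to lie in $(0,\infty)$.

```python
import numpy as np

def grenander(sample):
    """Return (knots, slopes); the estimator equals slopes[i] on (knots[i], knots[i+1]]."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Upper-left corners of the empirical distribution function, preceded by the origin.
    px = np.concatenate(([0.0], x))
    py = np.arange(n + 1) / n
    hull = [0]  # indices of the vertices ("kinks") of the least concave majorant
    for k in range(1, n + 1):
        # Drop the last vertex while it lies on or below the chord to the new point.
        while len(hull) >= 2:
            i, j = hull[-2], hull[-1]
            cross = (px[j] - px[i]) * (py[k] - py[i]) - (py[j] - py[i]) * (px[k] - px[i])
            if cross < 0:
                break
            hull.pop()
        hull.append(k)
    knots = px[hull]
    slopes = np.diff(py[hull]) / np.diff(knots)
    return knots, slopes

rng = np.random.default_rng(0)
knots, slopes = grenander(rng.uniform(size=100))
print(len(slopes), slopes[:3])
```

The slopes come out strictly decreasing and integrate to one over $[0,X_{(n)}]$, where $X_{(n)}$ is the largest observation; beyond $X_{(n)}$ this sketch takes the estimator to be zero.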

The number of jumps of the Grenander estimator is of order $\log n$ if the sample is from a uniform distribution (see Section 2); if the sample comes from a strictly decreasing smooth density like the exponential density, then the number of jumps is of order $cn^{1/3}$ , for some constant $c>0$ . The limit behavior of the Grenander estimator in these two situations is rather different. For a sample from the uniform distribution, we have for $t\in(0,1)$ :

$$\sqrt{n}\,\bigl\{\hat{f}_{n}(t)-1\bigr\}\stackrel{\mathcal{D}}{\longrightarrow}S_{t},\tag{1.1}$$

where $S_{t}$ is the slope of the least concave majorant of the standard Brownian Bridge on $[0,1]$ , see Remark 2.2, p. 543 of [7]. The density of $S_{t}$ is a function of the standard normal distribution function $\Phi$ and the standard normal density $\phi$ , see (3.11) in [9].

In contrast with the result (1.1), we get for a sample from a decreasing density $f$ on $[0,\infty)$ at a point $t\in(0,\infty)$ , where $f$ is differentiable and $f^{\prime}(t)<0$ , the following result, due to Prakasa Rao in [16]:

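$$n^{1/3}\,\bigl|4f(t)f^{\prime}(t)\bigr|^{-1/3}\bigl\{\hat{f}_{n}(t)-f(t)\bigr\}\stackrel{\mathcal{D}}{\longrightarrow}Z,$$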

where $Z=\mbox{argmax}_{t}\{W(t)-t^{2}\}$ , that is: $Z$ is the (almost surely unique) location of the maximum of two-sided Brownian motion minus the parabola $y(t)=t^{2}$ . For further details, see, e.g., [10] and [11].
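To get a feeling for the limit variable $Z$, one can approximate it by simulation. The following is a rough sketch under assumptions made here, not in the paper: two-sided Brownian motion is discretized on a grid with spacing `step`, and the search for the maximum is truncated to $[-3,3]$, which should cost little because the parabola dominates far from the origin.

```python
import numpy as np

def sample_argmax(rng, half_width=3.0, step=0.005):
    """One approximate draw of argmax_t {W(t) - t^2} on a grid over [-half_width, half_width]."""
    m = int(round(half_width / step))
    t_pos = step * np.arange(1, m + 1)
    # Two-sided Brownian motion: independent branches to the right and to the left of 0.
    right = np.cumsum(rng.normal(scale=np.sqrt(step), size=m))
    left = np.cumsum(rng.normal(scale=np.sqrt(step), size=m))
    t = np.concatenate((-t_pos[::-1], [0.0], t_pos))
    w = np.concatenate((left[::-1], [0.0], right))
    return t[np.argmax(w - t**2)]

rng = np.random.default_rng(1)
draws = np.array([sample_argmax(rng) for _ in range(2000)])
print(draws.mean(), draws.std())
```

The empirical distribution of `draws` is symmetric around zero up to simulation error; the grid and the truncation both introduce a small bias.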

Recently, integrated functionals of a monotone density were studied in [14]. Using the same notation as in [14] the following functionals were studied:

$$\mu(h,f)=\int_{0}^{1}h(f(x))\,dx,$$

where $f$ is a nonincreasing function on $\mathbb{R}_{+}$ and $h$ satisfies some regularity conditions. In the case that the underlying distribution is uniform, the following central limit result is proved:

Theorem 1.1 (Theorem 3.2 in [14]).

Let $h\in C^{4}([0,\infty))$ and let $h^{\prime\prime}(1)\neq 0$ . Then:

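$$\frac{n\bigl\{\mu(h,\hat{f}_{n})-\mu(h,f)\bigr\}-\frac{1}{2}h^{\prime\prime}(1)\log n}{\sqrt{\frac{3}{4}h^{\prime\prime}(1)^{2}\log n}}\stackrel{\mathcal{D}}{\longrightarrow}N(0,1),$$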

where $N(0,1)$ denotes the standard normal distribution.

We prove an analogous result for analytic functions $h$ , defined on the positive open complex half plane. So our functions $h$ have a lot more smoothness, but are on the other hand defined on the open complex half plane, which makes the result applicable to functions that are not covered by the conditions in [14]. We assume that $h$ satisfies the following condition.

Condition 1.1.

The function $h$ is analytic on the complex half plane $\{z\in\mathbb{C}:\text{Re}(z)>0\}$ , and satisfies the following conditions.

(i) $h^{\prime\prime}(1)\neq 0$ .

(ii) For $t\in\mathbb{R}$ :

$$|h(1-it)|=O(t^{2}),\qquad |t|\to\infty,$$

and

(iii) For $t\in\mathbb{R}$ :

$$|h^{\prime\prime}(1-it)|=O(1),\qquad |t|\to\infty.$$

We now have the following result.

Theorem 1.2.

Let $h$ satisfy Condition 1.1 and let $\hat{f}_{n}$ be the Grenander estimator. Then

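$$\frac{n\bigl\{\mu(h,\hat{f}_{n})-\mu(h,f)\bigr\}-\frac{1}{2}h^{\prime\prime}(1)\log n}{\sqrt{\frac{3}{4}h^{\prime\prime}(1)^{2}\log n}}\stackrel{\mathcal{D}}{\longrightarrow}N(0,1),$$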

where $N(0,1)$ denotes the standard normal distribution.

The result is a corollary to Theorem 4.1 in Section 4, see the remark at the end of Section 4.

Examples of the application of Theorem 1.2 are:

Example 1.1.

Let $h(z)=(z-1)^{2}$ . Then $h^{\prime}(z)=2(z-1)$ and $h^{\prime\prime}(z)=2$ . The function $h$ is obviously analytic on the positive complex half plane. Condition 1.1 is fulfilled, so we get

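$$\frac{1}{\sqrt{3\log n}}\left\{n\int_{0}^{1}\bigl\{\hat{f}_{n}(t)-1\bigr\}^{2}\,dt-\log n\right\}\stackrel{\mathcal{D}}{\longrightarrow}N(0,1),\qquad n\to\infty.$$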

This is the main result of [12]. Since in [14] Theorem 3.2 is deduced from this result, we also get Theorem 3.2 in [14] back from Theorem 1.2.

Example 1.2.

Let $h(z)=z\log z$ . Then $h^{\prime}(z)=1+\log(z)$ and $h^{\prime\prime}(z)=1/z$ . The function $h$ is again analytic on the positive complex half plane. Condition 1.1 is fulfilled, and we get:

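$$\frac{1}{\sqrt{\frac{3}{4}\log n}}\left\{n\int_{0}^{1}\hat{f}_{n}(t)\log\hat{f}_{n}(t)\,dt-\tfrac{1}{2}\log n\right\}\stackrel{\mathcal{D}}{\longrightarrow}N(0,1),\qquad n\to\infty.$$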

For this example the conditions of Theorem 3.2 in [14] are not satisfied. The result follows from Theorem 1.2 and can be applied to the theory on a likelihood ratio test for monotonicity in [2].
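The two examples can be evaluated on a simulated uniform sample as follows. This is a minimal sketch under assumptions made here: the helper names are hypothetical, the estimator is taken to be zero between the largest observation and $1$, and $0\log 0$ is read as $0$. Since $\mu(h,f)=h(1)=0$ for the uniform density in both examples, the standardized statistics only involve $n\,\mu(h,\hat{f}_{n})$.

```python
import numpy as np

def grenander_pieces(sample):
    """Values and interval widths of the Grenander estimator on [0, 1]."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    px = [0.0] + x.tolist()
    py = [k / n for k in range(n + 1)]
    hull = [0]  # vertices of the least concave majorant
    for k in range(1, n + 1):
        while len(hull) >= 2:
            i, j = hull[-2], hull[-1]
            cross = (px[j] - px[i]) * (py[k] - py[i]) - (py[j] - py[i]) * (px[k] - px[i])
            if cross < 0:
                break
            hull.pop()
        hull.append(k)
    knots = np.array([px[i] for i in hull])
    heights = np.array([py[i] for i in hull])
    slopes = np.diff(heights) / np.diff(knots)
    # Assumption of this sketch: the estimator is 0 on (max observation, 1].
    return np.append(slopes, 0.0), np.append(np.diff(knots), 1.0 - knots[-1])

def h_square(z):
    return (z - 1.0) ** 2

def h_entropy(z):
    safe = np.where(z > 0, z, 1.0)  # log(1) = 0 implements the convention 0 log 0 = 0
    return z * np.log(safe)

def mu(h, values, widths):
    """Integral of h(f_hat) over [0, 1] for a piecewise constant f_hat."""
    return float(np.sum(h(values) * widths))

rng = np.random.default_rng(3)
n = 1000
values, widths = grenander_pieces(rng.uniform(size=n))
mu_sq, mu_ent = mu(h_square, values, widths), mu(h_entropy, values, widths)
stat_square = (n * mu_sq - np.log(n)) / np.sqrt(3.0 * np.log(n))
stat_entropy = (n * mu_ent - 0.5 * np.log(n)) / np.sqrt(0.75 * np.log(n))
print(stat_square, stat_entropy)
```

A single sample gives one draw of each standardized statistic; because the normalization is only of order $\sqrt{\log n}$, agreement with the standard normal limit should be expected to set in slowly.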

To derive limit results for the uniform distribution, a special representation in terms of gamma and Poisson random variables was given in [12], with the purpose of proving a limit result for a two-sample rank statistic introduced in the dissertation of [1] and also for a test statistic in the combination of tests in [18]. We describe this representation now.

Let $X_{1},\dots,X_{n}$ be a sample from the uniform distribution, and let $0=\xi_{n0}<\xi_{n1}<\dots<\xi_{nm}<\xi_{n,m+1}=1$ be the locations of the jumps of the Grenander estimator $\hat{f}_{n}$ for this sample, augmented with the points $0$ and $1$ . Note that $[\xi_{n0},\xi_{n1}]$ , $(\xi_{n1},\xi_{n2}]$ , $(\xi_{n2},\xi_{n3}]$ , $\dots$ , $(\xi_{nm},1]$ are the successive intervals of constancy of the Grenander estimator if we take the estimator to be left-continuous.

Furthermore, let $D_{ni}$ , $J_{ni}$ and $Q_{nj}$ be defined by:

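$$\begin{array}{ll}D_{ni}=\xi_{ni}-\xi_{n,i-1}, & i=1,\dots,m+1,\\ J_{ni}=n\left\{\mathbb{F}_{n}(\xi_{ni})-\mathbb{F}_{n}(\xi_{n,i-1})\right\}, & i=1,\dots,m+1,\\ Q_{nj}=\#\left\{i:J_{ni}=j\right\}, & \end{array}$$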

where ${\mathbb{F}}_{n}$ is the empirical distribution function of the sample $X_{1},\dots,X_{n}$ , and where $m$ is the number of jumps of the Grenander estimator.

Next, let $\{N_{j}:j\geq 1\}$ be independent Poisson random variables with ${\mathbb{E}}N_{j}=1/j$ , and let $\{S_{ji}:i,j\geq 1\}$ be a collection of independent gamma random variables, independent of the $N_{j}$ , where $S_{ji}$ is Gamma $(j,1)$ (the sum of $j$ independent standard exponentials). We define:

$$S_{n}=\sum_{j=1}^{n}\sum_{i=1}^{N_{j}}S_{ji},\qquad T_{n}=\sum_{j=1}^{n}jN_{j},\tag{1.9}$$

and

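$$\bm{S}^{(n)}=(S_{11},\dots,S_{1,N_{1}},\dots,S_{n1},\dots,S_{n,N_{n}}),\qquad \bm{N}^{(n)}=(N_{1},\dots,N_{n}).$$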

Note that there are $N_{1}$ induced spacings $\xi_{ni}$ (intervals of constancy of $\hat{f}_{n}$ ) of length $1$ , $N_{2}$ induced spacings $\xi_{ni}$ (intervals of constancy of $\hat{f}_{n}$ ), consisting of two consecutive original spacings between locations of jumps of the least concave majorant, etc., where $N_{j}$ can be zero.

We now have the following representation theorem:

Theorem 1.3 (Theorem 2.1 in [12]).

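$$\left(nD_{n1},\dots,nD_{n,m+1};Q_{n1},\dots,Q_{n,m}\right)\stackrel{\mathcal{D}}{=}\left(\bm{S}^{(n)},\bm{N}^{(n)}\bigm|S_{n}=n,T_{n}=n\right)$$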

Remark 1.1.

For specificity, the random variables $S_{ji},\,i=1,\dots,N_{j}$ , and $D_{ji},\,i=1,\dots,Q_{nj}$ , are ordered in Theorem 2.1 in [12]. There is also a zero-step spacing introduced in [12], but this does not seem to be necessary.

Using this representation, we can reduce the proofs of the limit behavior of global functionals of the Grenander estimator to a theorem for gamma and Poisson random variables, under the condition $(T_{n},S_{n})=(n,n)$ . For convenience in later proofs, we also use a further standardization of $(S_{n},T_{n})$ :

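$$V_{n}=n^{-1/2}\biggl\{S_{n}-\sum_{j=1}^{n}jN_{j}\biggr\}=n^{-1/2}\left\{S_{n}-T_{n}\right\},\qquad W_{n}=\frac{T_{n}}{n},\tag{1.10}$$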

where $S_{n}$ and $T_{n}$ are defined by (1.9). A conditional central limit theorem for functionals of the Grenander estimator can then be proved under the condition:

$$V_{n}=0,\qquad W_{n}=1.$$

The infinitely divisible limit distribution of the pair $(V_{n},W_{n})$ was given in Lemma 3.1 of [12], but unfortunately Lemma 3.1 of [12] contains a rather silly error (the $u^{2}$ in the representation of the characteristic function should be $u$ ). The correct version of this result is given in Lemma 3.1 below, where also the origin of the error is explained. The proof in [8] does not use the result on the limit distribution of $(V_{n},W_{n})$ , so is not influenced by the erroneous Lemma 3.1 in [12].

We use methods different from those in [12]. The conditional central limit theorem was proved in [12] using Le Cam’s paper [13] (a paper apparently published without his permission, as became clear in a conversation of the author with him). In the present context, where we clearly have to deal with non-standard asymptotics, [13] does not seem to be the right tool to use. We replace this by a direct analysis of the characteristic function. The crucial tools here are Cauchy’s formula and saddle point methods, using contour integration in the complex plane. To illustrate our method in a simple setting, we give a shortened version of the proof in [8] of Sparre Andersen’s result [19] in Section 2.

2 Sparre Andersen’s result

To illustrate our method in the simplest setting, we give a short version of the proof in [8] of the following remarkable result of Sparre Andersen in [19].

Theorem 2.1 (Sparre Andersen’s result).

Let $X_{1},\dots,X_{n}$ be a sample from the Uniform $(0,1)$ distribution and let $N_{jumps}$ be the number of jumps of the Grenander estimator for this sample. Then

$$\frac{N_{jumps}-\log n}{\sqrt{\log n}}\stackrel{\mathcal{D}}{\longrightarrow}N(0,1),$$

where $N(0,1)$ is the standard normal distribution.
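The theorem can be explored by simulation. The sketch below is an illustration under assumptions made here: it counts the linear pieces of the least concave majorant through the points $(X_{(i)},i/n)$, which differs from the number of jumps by at most one depending on how the boundary is treated, and it compares the average count with the harmonic sum $\sum_{j\leq n}1/j$, the unconditional mean of $\sum_{j}N_{j}$ in the Poisson representation of Section 1.

```python
import numpy as np

def count_pieces(sample):
    """Number of linear pieces of the least concave majorant of the empirical distribution function."""
    x = sorted(sample)
    n = len(x)
    px = [0.0] + list(x)
    py = [k / n for k in range(n + 1)]
    hull = [0]
    for k in range(1, n + 1):
        # Pop vertices that fall on or below the chord to the new point.
        while len(hull) >= 2:
            i, j = hull[-2], hull[-1]
            cross = (px[j] - px[i]) * (py[k] - py[i]) - (py[j] - py[i]) * (px[k] - px[i])
            if cross < 0:
                break
            hull.pop()
        hull.append(k)
    return len(hull) - 1

rng = np.random.default_rng(2)
n, reps = 200, 1000
counts = np.array([count_pieces(rng.uniform(size=n).tolist()) for _ in range(reps)])
harmonic = float(np.sum(1.0 / np.arange(1, n + 1)))
standardized = (counts - np.log(n)) / np.sqrt(np.log(n))
print(counts.mean(), harmonic, standardized.mean(), standardized.std())
```

Since the centering and scaling are only logarithmic in $n$, the standardized counts approach the normal limit slowly, and a visible offset of the mean at moderate $n$ is to be expected.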

Proof.

Let, for the sample $X_{1},\dots,X_{n}$ , $U_{n}$ be defined by:

$$U_{n}=\frac{\sum_{j=1}^{n}N_{j}-\log n}{\sqrt{\log n}},$$

and let $T_{n}$ be defined as in (1.9). Using (part of) the representation, introduced in Section 1, we only have to prove that $(U_{n}|T_{n}=n)$ tends in law to a standard normal distribution. To this end we consider the conditional characteristic function

$$\mathbb{E}\left\{e^{isU_{n}}\bigm|T_{n}=n\right\}.$$

Lemma 3.2 of [12] implies:

$$\mathbb{P}\left\{T_{n}=n\right\}=\exp\biggl\{-\sum_{j=1}^{n}\frac{1}{j}\biggr\}.$$

Hence we get, by Fourier inversion and using the notation $b_{n}=\sqrt{\log n}$ ,

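$$\begin{aligned}\mathbb{E}\left\{e^{isU_{n}}\bigm|T_{n}=n\right\}&=\exp\Biggl\{\sum_{j=1}^{n}\frac{1}{j}\Biggr\}\frac{1}{2\pi}\int_{u=-\pi}^{\pi}\mathbb{E}\,e^{isU_{n}+iuT_{n}-inu}\,du\\ &=\exp\Biggl\{\sum_{j=1}^{n}\frac{1}{j}\Biggr\}\frac{1}{2\pi}\int_{u=-\pi}^{\pi}\exp\Biggl\{-isb_{n}+\sum_{j=1}^{n}\frac{1}{j}\left(e^{iju+is/b_{n}}-1\right)-inu\Biggr\}\,du\\ &=e^{-isb_{n}}\frac{1}{2\pi}\int_{u=-\pi}^{\pi}\exp\Biggl\{e^{is/b_{n}}\sum_{j=1}^{n}\frac{1}{j}e^{iju}-inu\Biggr\}\,du.\end{aligned}$$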

Denoting the contour $u\mapsto e^{iu},\,u\in[-\pi,\pi),$ by $C$ , we write this in the form

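$$e^{-isb_{n}}\frac{1}{2\pi i}\int_{C}\exp\Biggl\{\delta_{n}\sum_{j=1}^{n}\frac{z^{j}}{j}\Biggr\}z^{-n-1}\,dz,\tag{2.1}$$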

where $\delta_{n}$ is given by:

$$\delta_{n}=e^{is/b_{n}}.$$

The expression in (2.1), multiplied by $e^{isb_{n}}$ , is by Cauchy’s formula equal to the coefficient of $z^{n}$ in the power series around $z=0$ of the function:

$$z\mapsto\exp\Biggl\{\delta_{n}\sum_{j=1}^{n}\frac{z^{j}}{j}\Biggr\}.$$

Comparing this with the power series of the function $z\mapsto(1-z)^{-\delta_{n}}$ , we see that the coefficient of $z^{n}$ is the same in both series. This coefficient is:

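$$(-1)^{n}\binom{-\delta_{n}}{n}=\prod_{j=1}^{n}\left(1+\frac{\delta_{n}-1}{j}\right)=\exp\Biggl\{\sum_{j=1}^{n}\log\left(1+\frac{\delta_{n}-1}{j}\right)\Biggr\}.$$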
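The coefficient comparison in this step can be checked numerically. The sketch below is an illustration with hypothetical function names: it extracts the coefficient of $z^{n}$ in $\exp\{\delta\sum_{j\leq n}z^{j}/j\}$ through the standard recurrence for the exponential of a power series, and compares it with the product $\prod_{j\leq n}\{1+(\delta-1)/j\}$, which is the coefficient of $z^{n}$ in $(1-z)^{-\delta}$.

```python
import cmath

def coeff_exp_series(delta, n):
    """Coefficient of z^n in exp(delta * sum_{j=1}^n z^j / j)."""
    g = [1.0 + 0.0j]  # g[m] is the coefficient of z^m
    for m in range(1, n + 1):
        # If g = exp(f) with f_k = delta / k, then m * g_m = sum_{k=1}^m k f_k g_{m-k}
        # = delta * (g_0 + ... + g_{m-1}).
        g.append(delta * sum(g) / m)
    return g[n]

def coeff_product(delta, n):
    """Coefficient of z^n in (1 - z)^(-delta), written as a product."""
    prod = 1.0 + 0.0j
    for j in range(1, n + 1):
        prod *= 1.0 + (delta - 1.0) / j
    return prod

delta = cmath.exp(0.7j)  # plays the role of exp(i s / b_n)
print(coeff_exp_series(delta, 40), coeff_product(delta, 40))
```

For $\delta=1$ both expressions equal $1$, which is the same power series argument that gives $\mathbb{P}\{T_{n}=n\}=\exp\{-\sum_{j\leq n}1/j\}$.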

So we get:

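$$\begin{aligned}\mathbb{E}\left\{e^{isU_{n}}\bigm|T_{n}=n\right\}&=\exp\Biggl\{-isb_{n}+\sum_{j=1}^{n}\log\left(1+\frac{\delta_{n}-1}{j}\right)\Biggr\}\\ &=\exp\Biggl\{-isb_{n}+\sum_{j=1}^{n}\log\left(1+\frac{e^{is/b_{n}}-1}{j}\right)\Biggr\}\\ &=\exp\Biggl\{-isb_{n}+\sum_{j=1}^{n}\frac{e^{is/b_{n}}-1}{j}+o(1)\Biggr\}=\exp\Biggl\{-\frac{s^{2}}{2b_{n}^{2}}\sum_{j=1}^{n}\frac{1}{j}+o(1)\Biggr\}\\ &=\exp\left\{-\tfrac{1}{2}s^{2}+o(1)\right\}.\end{aligned}$$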

∎
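The speed of the convergence in this proof can be inspected numerically. The following sketch, with a hypothetical function name, evaluates the exact conditional characteristic function $\exp\{-isb_{n}+\sum_{j\leq n}\log(1+(\delta_{n}-1)/j)\}$ with $b_{n}=\sqrt{\log n}$ and $\delta_{n}=e^{is/b_{n}}$, and compares it with the limit $e^{-s^{2}/2}$.

```python
import numpy as np

def conditional_cf(s, n):
    """exp{-i s b_n + sum_j log(1 + (delta_n - 1) / j)} with b_n = sqrt(log n)."""
    b = np.sqrt(np.log(n))
    delta = np.exp(1j * s / b)
    j = np.arange(1, n + 1)
    log_sum = np.sum(np.log(1.0 + (delta - 1.0) / j))
    return np.exp(-1j * s * b + log_sum)

s = 1.0
errors = [abs(conditional_cf(s, n) - np.exp(-0.5 * s**2)) for n in (10**2, 10**4, 10**6)]
print(errors)
```

The distance to the limit shrinks as $n$ grows, but only slowly, because the remainder terms are of order $1/b_{n}=1/\sqrt{\log n}$.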

3 The limit distribution of the conditioning variables $(V_{n},W_{n})$

Let the pair $(V_{n},W_{n})$ be defined by (1.10). We prove the following lemma, which corrects Lemma 3.1 in [12].

Lemma 3.1.

The pair $(V_{n},W_{n})$ converges in distribution to $(V,W)$ , where $(V,W)$ has the infinitely divisible characteristic function

$$\phi_{(V,W)}(t,u)=\exp\left\{\int_{0}^{1}\frac{e^{-(\frac{1}{2}t^{2}-iu)y}-1}{y}\,dy\right\}.$$

Proof.

We have:

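$$\begin{aligned}\mathbb{E}\exp\left\{itV_{n}+iuW_{n}\right\}&=\exp\Biggl\{\sum_{j=1}^{n}\frac{e^{iju/n}\,\phi_{S_{j1}-j}\bigl(tn^{-1/2}\bigr)-1}{j}\Biggr\}\\ &=\exp\Biggl\{\sum_{j=1}^{n}\frac{e^{ij(u/n-tn^{-1/2})}\left(1-itn^{-1/2}\right)^{-j}-1}{j}\Biggr\},\end{aligned}$$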

where $\phi_{S_{j1}-j}$ is the characteristic function of the centered gamma variable $S_{j1}-j$ , see (3.9) of [12].

Writing $y_{j,n}=j/n$ , and noting that for $y\in(0,1)$ :

[TABLE]

and that the limit of the expression on the left for $y\downarrow 0$ is equal to:

[TABLE]

we can write the exponent in the form

[TABLE]

(it is here that the mistake was made in [12], in the formula after (3.9) on p. 333), so we get a Riemann sum converging to the integral

[TABLE]

The infinite divisibility of the limit distribution is shown below. ∎
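The Riemann sum argument of the lemma can be checked numerically. This sketch uses hypothetical function names: it evaluates the finite-$n$ characteristic function $\exp\{\sum_{j\leq n}(e^{ij(u/n-tn^{-1/2})}(1-itn^{-1/2})^{-j}-1)/j\}$ and compares it with the limit $\exp\{\int_{0}^{1}(e^{-(t^{2}/2-iu)y}-1)/y\,dy\}$, where the integral is approximated by a midpoint rule.

```python
import numpy as np

def cf_finite_n(t, u, n):
    """Characteristic function of (V_n, W_n) at (t, u), from the sum over j."""
    j = np.arange(1, n + 1)
    x = t / np.sqrt(n)
    # Logarithm of e^{i(u/n - x)} (1 - ix)^{-1}; the j-th term of the sum uses its j-th power.
    log_term = 1j * (u / n - x) - np.log(1.0 - 1j * x)
    return np.exp(np.sum((np.exp(j * log_term) - 1.0) / j))

def cf_limit(t, u, grid=200_000):
    """Limit characteristic function, with the integral over (0, 1) by the midpoint rule."""
    c = 0.5 * t**2 - 1j * u
    y = (np.arange(grid) + 0.5) / grid
    return np.exp(np.mean((np.exp(-c * y) - 1.0) / y))

print(cf_finite_n(1.0, 0.5, 10**5), cf_limit(1.0, 0.5))
```

With the exponent $u$ replaced by $u^{2}$ in `cf_limit` the two values no longer agree for $u\neq 1$, which is one way to see the effect of the correction discussed in Remark 3.1.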

Remark 3.1.

In [12] first the limit distribution of $U_{n}$ is computed, using moment conditions (going up to the $8$ th moments). Next the limit distribution of $(V_{n},W_{n})$ is computed and it is stated that this distribution is infinitely divisible and has no normal component, implying that the limit of $(V_{n},W_{n})$ has to be independent of the limit of $U_{n}$ .

The $s^{2}u^{2}$ in the exponent of the characteristic function of the limit distribution of $(V_{n},W_{n})$ in Lemma 3.1 of [12] should be $s^{2}u$ . The incorrect $u^{2}$ arose on p. 333 of [12], where the limit of the characteristic function of the rescaled gamma random variable $(S_{j}-j)/\sqrt{n}$ was given by $\exp\{-s^{2}u^{2}/2\}$ instead of $\exp\{-s^{2}u/2\}$ . This also invalidates the ensuing remarks on p. 333 of [12]. We correct these remarks below.

The distribution of $(V,W)$ is infinitely divisible, as we now show. A general characterization of infinitely divisible distributions in $\mathbb{R}^{d}$ is given in [17] and is restated below for convenience.

Theorem 3.1 (Theorem 1.3 in [17], Lévy-Khintchine representation).

If the distribution $\mu$ is infinitely divisible, then its characteristic function $\hat{\mu}(\bm{s})=\int_{\mathbb{R}^{d}}\exp\{i\langle\bm{s},\bm{z}\rangle\}\,d\mu(\bm{z})$ is given by:

[TABLE]

where $\bm{A}$ is a symmetric nonnegative-definite $d\times d$ matrix, $\|\cdot\|$ is the Euclidean norm, $\nu$ is a measure on $\mathbb{R}^{d}$ satisfying $\nu(\{0\})=0$ , $\int_{\mathbb{R}^{d}}\left(\|\bm{z}\|^{2}\wedge 1\right)\,d\nu(\bm{z})<\infty$ , and where $\bm{\delta}\in\mathbb{R}^{d}$ . The representation (3.2) by $\bm{A},\nu$ and $\bm{\delta}$ is unique. Conversely, for any choice of $\bm{A},\nu$ and $\bm{\delta}$ satisfying the conditions above, there exists an infinitely divisible distribution $\mu$ having characteristic function (3.2).

In the present situation we can take $\bm{A}$ the $2\times 2$ matrix with zeroes, $\bm{\delta}=(0,0)^{T}$ and define $\nu$ by the density

[TABLE]

where $\phi$ is the standard normal density. With these choices of $\bm{A}$ , $\bm{\delta}$ and $\nu$ we get, using the notation $\bm{s}=(t,u)^{T}$ and $\bm{z}=(v,w)^{T}$ ,

[TABLE]

We note that the computer package Mathematica evaluates the characteristic function for $(V,W)$ in the following way:

[TABLE]

where $\gamma$ is Euler’s gamma and $\Gamma(0,\tfrac{1}{2}t^{2}-iu)$ is the complementary incomplete gamma function, defined by:

[TABLE]

see, e.g., (2.01) on p. 109 of [15].

4 Central limit theorem for $\int h(\hat{f}_{n}(x))\,dx$

In this section, we use the notation

[TABLE]

Using the conditioning of Section 1, the statistic $\int h(\hat{f}_{n}(x))\,dx$ has the following representation:

[TABLE]

where the Poisson random variables $N_{j}$ and the gamma random variables $S_{ji}$ are defined as in (1.9), and where we condition on $(V_{n},W_{n})=(0,1)$ , where $(V_{n},W_{n})$ is defined by (1.10). We define

[TABLE]

Remark 4.1.

The terms $h^{\prime}(1)\left(S_{ji}-j\right)$ are present in $U_{n}$ as variance reducing terms and give, after the summation over $i$ and $j$ , a zero contribution to $U_{n}$ if $(V_{n},W_{n})=(0,1)$ . Also note that

[TABLE]

if the condition $(V_{n},W_{n})=(0,1)$ is satisfied.

We assume that the function $h$ satisfies condition 1.1 and first consider the conditional density of $V_{n}$ , given $W_{n}=1$ .

Lemma 4.1.

The conditional density of $V_{n}$ , given $W_{n}=1$ , is the density of a centered and standardized Gamma $(n,1)$ variable:

[TABLE]

The density $f_{V_{n}|W_{n}=1}(x)$ converges uniformly to the standard normal density, as $n\to\infty$ .

Proof.

By Fourier inversion, the conditional characteristic function is given by:

[TABLE]

where $c_{n}=\sqrt{n}$ , see (3). Denoting the contour $w\mapsto e^{iw},\,w\in[-\pi,\pi),$ by $C$ , we write this in the form

[TABLE]

where $\beta_{n}(t)$ is given by:

[TABLE]

and where we use:

[TABLE]

by Lemma 3.2 of [12]. An application of Cauchy’s formula yields:

[TABLE]

where we use that the coefficient of $z^{n}$ in the power series around $z=0$ of the function $z\mapsto\exp\{\sum_{j=1}^{n}(\beta_{n}(t)z)^{j}/j\}$ is the same as the coefficient of $z^{n}$ in the power series of the function $z\mapsto(1-\beta_{n}(t)z)^{-n}$ .

Hence

[TABLE]

This is just the characteristic function of the sum of $n$ standardized exponential variables, and hence its density tends uniformly to the standard normal density by Theorem 2 on p. 516 of [5]. ∎
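The uniform convergence stated in Lemma 4.1 can be illustrated numerically. The sketch below is an assumption-laden illustration: it writes the density of $(G-n)/\sqrt{n}$ for $G$ distributed as Gamma $(n,1)$ directly from the gamma density, and approximates the supremum distance to the standard normal density by a maximum over a grid on $[-6,6]$.

```python
import math
import numpy as np

def standardized_gamma_density(x, n):
    """Density at x of (G - n) / sqrt(n), where G has the Gamma(n, 1) distribution."""
    y = n + x * math.sqrt(n)
    out = np.zeros_like(x)
    pos = y > 0  # the gamma density vanishes on the negative half line
    log_dens = (n - 1) * np.log(y[pos]) - y[pos] - math.lgamma(n) + 0.5 * math.log(n)
    out[pos] = np.exp(log_dens)
    return out

x = np.linspace(-6.0, 6.0, 2401)
normal = np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)
sup_diff = {
    n: float(np.max(np.abs(standardized_gamma_density(x, n) - normal)))
    for n in (10, 100, 1000)
}
print(sup_diff)
```

The maximal difference decreases roughly like $n^{-1/2}$, which is the rate one would expect from the skewness of the gamma distribution.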

So, in particular, we get:

[TABLE]

where $\gamma$ is Euler’s gamma. The characteristic function of $(U_{n}|V_{n}=0,W_{n}=1)$ is therefore given by

[TABLE]

We now consider the characteristic function $\phi_{nj}$ , defined by:

[TABLE]

which involves the components of the random variables $U_{n}$ and $V_{n}$ . Its asymptotic behavior is determined using a saddle point method.

Lemma 4.2.

The characteristic function (4.5) satisfies

[TABLE]

uniformly for $t$ in a bounded interval.

Proof.

After a change of variables, $\phi_{nj}(s,t)$ can be written:

[TABLE]

where $f_{n,s,t}$ is defined on the right half plane by:

[TABLE]

The derivative of $f_{n,s,t}$ is given by

[TABLE]

A saddle point is given by the equation

[TABLE]

Multiplying both sides of this equation with $z$ , we get the equation

[TABLE]

This equation has, for sufficiently large $n$ , a unique solution in a neighborhood of $z_{0}(t)=\left(1-\frac{it}{c_{n}}\right)^{-1}$ , as is clear from the following properties.

(i)

[TABLE]

(ii)

For $z_{0}(t)=\left(1-\frac{it}{c_{n}}\right)^{-1}$ we have:

[TABLE]

see, e.g., (3.3), p. 56 of [4]. So the saddle point is given by a solution of the equation (4.9) and can be found by the simple iteration $z_{k+1}=g_{n}(z_{k})$ , starting at $z_{0}(t)$ . We can, however, also take $z_{0}(t)$ itself instead of the real saddle point for the asymptotic expansion, since this gives us the same terms in the expansion we need.

The value of $f_{n,s,t}$ at $z_{0}(t)$ has the following expansion:

[TABLE]

Furthermore,

[TABLE]

so

[TABLE]

It follows that we get:

[TABLE]

where $\alpha_{n}=\exp\left\{ish^{\prime\prime}(1)/(2b_{n})\right\}$ is a complex number with absolute value 1, corresponding to the argument of the main axis of the saddle point (note that the argument of this axis is $\tfrac{1}{2}\pi-\tfrac{1}{2}\arg f_{n,s,t}^{\prime\prime}(z_{0}(t))$ , see [3], p. 84).

Evaluating the integrand at $z_{0}(t)$ , and applying Stirling’s formula on $\Gamma(j)$ , we obtain the following asymptotic representation:

[TABLE]

see, e.g., [3], (5.10.3) on p. 92 for the first asymptotic equivalence. The second asymptotic equivalence holds uniformly for $t$ in a bounded interval. Note that this corresponds to changing the path of integration for $x$ to a path in the complex plane, going through the saddle point. ∎

We can now prove the following property of the characteristic function of $(U_{n},V_{n},W_{n})$ . This is “almost” the Fourier inversion for $(V_{n},W_{n})$ , but we still have to extend the inversion for $V_{n}$ to the whole real line. Cauchy’s formula is an essential ingredient of the proof of Lemma 4.3.

Lemma 4.3.

Let $U_{n}$ be defined by (4.2) and let $h$ satisfy condition 1.1. Then, for each $M>0$ :

[TABLE]

where $\gamma$ is Euler’s gamma.

Proof.

Let $\phi_{nj}$ be defined by (4.5). Since, by (4.1) and (4.2),

[TABLE]

where

[TABLE]

we get, evaluating the probabilities for the Poisson random variables $N_{j}$ in the third line:

[TABLE]

As in the proof of Lemma 4.1 we consider the contour $w\mapsto e^{iw},\,w\in[-\pi,\pi),$ and denote this contour by $C$ . So, integrating (4) w.r.t. $u$ and changing variables we get:

[TABLE]

where

[TABLE]

and

[TABLE]

and where $\alpha_{n}$ is a complex number of absolute value 1, corresponding to the angle of the main axis through the (approximate) saddle point $z_{0}(t)$ .

So we get, by Cauchy’s formula, for $t\in[-M,M]$ and $M$ arbitrarily large,

[TABLE]

Thus:

[TABLE]

∎

We still have to prove that the remaining part of the integral w.r.t. the integration variable $t$ can be made arbitrarily small by choosing $M$ large. To this end we split the remaining region into two regions: $A_{1}=\{t\in\mathbb{R}:M<|t|\leq\delta n^{1/2}\}$ and $A_{2}=\{t\in\mathbb{R}:|t|>\delta n^{1/2}\}$ . This split-up is familiar from inversion theorems for densities, see, e.g., the proof of Theorem 2 on p. 516 of [5]. We start with the region $A_{1}$ .

Lemma 4.4.

Let $h$ satisfy condition 1.1. Then there exists for each $\varepsilon>0$ an $M>0$ and $\delta>0$ such that

[TABLE]

Proof.

We consider again the expansion of the function $f_{n,s,t}$ defined by (4.7) at the point $z_{0}(t)=(1-it/c_{n})^{-1}$ . We get:

[TABLE]

where $\theta$ is a point on the line segment between $1$ and $1/(1-it/c_{n})$ . Likewise

[TABLE]

where $\theta^{\prime}$ is a point on the line segment between $1$ and $1/(1-it/c_{n})$ .

So we have a local expansion

[TABLE]

as in (4). This means that we can follow the same steps as in the proof of Lemma 4.3 and that we can choose $M$ and $\delta>0$ in such a way that

[TABLE]

∎

The following lemma deals with the region $A_{2}=\{t\in\mathbb{R}:|t|>\delta n^{1/2}\}$ .

Lemma 4.5.

Let $h$ satisfy condition 1.1. Then, for each $\delta>0$ :

[TABLE]

Proof.

We consider the characteristic function:

[TABLE]

so we replace $c_{n}$ by $1$ in (4.5). This means that, for the saddle point analysis, the constant $c_{n}$ is replaced by $1$ in the function (4.7). So we now define

[TABLE]

The saddle point equation (4.9) now turns into

[TABLE]

and has a unique solution in a neighborhood of $(1-it)^{-1}$ for the same reasons as before.

We define:

[TABLE]

Then:

[TABLE]

and

[TABLE]

uniformly in $|t|>\delta$ , using Condition 1.1. This implies that, uniformly for $|t|>\delta$ ,

[TABLE]

where $\alpha_{n}(t)$ is a complex number with absolute value 1 and argument $\tfrac{1}{2}\pi-\tfrac{1}{2}\arg\bar{f}_{n,s,t}^{\prime\prime}(z_{0}(t))$ .

So we find, using Cauchy’s formula again, as in the proof of Lemma 4.3, and using Condition 1.1,

[TABLE]

where

[TABLE]

and

[TABLE]

Hence

[TABLE]

implying

[TABLE]

for large $n$ , uniformly for $|t|>\delta$ , using Condition 1.1. It now follows that

[TABLE]

∎

This leads to the main result of this section.

Theorem 4.1.

Let $h$ satisfy Condition 1.1. Then $U_{n}|(V_{n},W_{n})=(0,1)$ converges in law to a standard normal distribution.

Proof.

The preceding lemmas imply

[TABLE]

where $\gamma$ is Euler’s gamma. Hence

[TABLE]

∎

Theorem 1.2 now follows from the conditional representation from Section 1, definition (4.2) of $U_{n}$ , Remark 4.1 and Theorem 4.1.

5 Conclusion

We derived a general theorem (Theorem 1.2) for integrals of the Grenander estimator when the distribution is uniform from a representation in terms of Poisson and gamma random variables in [12]. The result implies the main result of [12] and gives also the limit behavior of the entropy functional. We corrected the limit distribution of the conditioning variables given in Lemma 3.1 of [12] in Section 3. The methods used are rather different from the methods in [12], where a result in [13] was used.

Our main result was inspired by [14] who derived a similar result from [12] under different conditions. The main tools are Cauchy’s formula and the saddle point method for integrals of analytic functions of a complex variable. A simple version of the approach is given in Section 2 to illustrate the method without the complications of the saddle point method.

Bibliography

[1] K. Behnen. Güteeigenschaften von Rangtests unter Bindungen. Habilitationsschrift, University of Freiburg, 1974.
[2] K. C. G. Chan, C-F Tang, and S.C.P Yam. Likelihood ratio test for monotonicity of density. Submitted, 2018.
[3] N. G. de Bruijn. Asymptotic methods in analysis. Dover Publications, Inc., New York, third edition, 1981. ISBN 0-486-64221-6.
[4] Jean Dieudonné. Calcul infinitésimal. Hermann, Paris, 1968.
[5] William Feller. An introduction to probability theory and its applications. Vol. II. Second edition. John Wiley & Sons, Inc., New York-London-Sydney, 1971.
[6] Ulf Grenander. On the theory of mortality measurement. II. Skand. Aktuarietidskr., 39:125–153 (1957), 1956.
[7] P. Groeneboom. Estimating a monotone density. In Proceedings of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, Vol. II (Berkeley, Calif., 1983), Wadsworth Statist./Probab. Ser., pages 539–555, Belmont, CA, 1985. Wadsworth.
[8] P. Groeneboom and H. P. Lopuhaä. Isotonic estimators of monotone densities and distribution functions: basic facts. Statist. Neerlandica, 47:175–183, 1993. ISSN 0039-0402. doi:10.1111/j.1467-9574.1993.tb01415.x. URL http://dx.doi.org/10.1111/j.1467-9574.1993.tb01415.x.
