Easy Maximum Empirical Likelihood Estimation of Linear Functionals Of A Probability Measure With Infinitely Many Constraints

Shan Wang; Hanxiang Peng

arXiv:2302.14768·stat.ME·March 1, 2023

TL;DR

This paper introduces a simple empirical likelihood method for efficiently estimating linear functionals of a probability measure with infinitely many constraints, applicable in various informational settings.

Contribution

It develops an easy empirical likelihood estimator that handles infinitely many constraints and different types of side information, improving estimation efficiency.

Findings

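1. The estimator achieves semiparametric efficiency.
2. Simulations show efficiency gains: the reported ratio $\tilde{\lambda}/\lambda$ of maximal eigenvalues is roughly 0.07 to 0.15 when the componentwise medians are known (Table 1) and roughly 0.55 to 0.71 when one marginal is symmetric (Tables 2 to 5).
3. The method applies to known marginals, unknown identical marginals, and symmetric distributions.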

Abstract

In this article, we construct semiparametrically efficient estimators of linear functionals of a probability measure in the presence of side information using an easy empirical likelihood approach. We use estimated constraint functions and allow the number of constraints to grow with the sample size. Considered are three cases of information which can be characterized by infinitely many constraints: (1) the marginal distributions are known, (2) the marginals are unknown but identical, and (3) distributional symmetry. An improved spatial depth function is defined and its asymptotic properties are studied. Simulation results on efficiency gain are reported.

Tables

Table 1: Simulated maximal eigenvalues $\tilde{\lambda}$ and $\lambda$ of the variance-covariance matrices of the EL-weighted and sample spatial medians with data generated from a few distributions in the presence of known componentwise medians $(0,0)$ and $(0,0,0)$.

Cauchy
	dim=2			dim=3
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0641	0.0094	0.1463	0.0830	0.0111	0.1339
100	0.0330	0.0040	0.1209	0.0379	0.0048	0.1260
200	0.0157	0.0018	0.1162	0.0207	0.0024	0.1153
500	0.0060	0.0007	0.1174	0.0078	0.0009	0.1181
Student $t$ (df=3)
	dim=2			dim=3
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0432	0.0064	0.1477	0.0609	0.0083	0.1363
100	0.0218	0.0029	0.1322	0.0281	0.0037	0.1329
200	0.0119	0.0014	0.1161	0.0145	0.0017	0.1198
500	0.0046	0.0005	0.1096	0.0055	0.0007	0.1257
Copula distribution with marginals $N (0, 1)$ & $t (3)$
	dim=2			dim=3
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0523	0.0051	0.0972	0.0542	0.0082	0.1515
100	0.0262	0.0021	0.0790	0.0278	0.0036	0.1285
200	0.0131	0.0009	0.0715	0.0135	0.0017	0.1291
500	0.0054	0.0004	0.0679	0.0055	0.0007	0.1235
Asymmetric Laplace
	dim=2			dim=3
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0153	0.0021	0.1360	0.0191	0.0024	0.1248
100	0.0072	0.0009	0.1213	0.0090	0.0011	0.1209
200	0.0035	0.0004	0.1141	0.0043	0.0005	0.1155
500	0.0013	0.0001	0.1139	0.0017	0.0002	0.1072

Table 2: Same as Table 1 except for data generated from the $3$-dimensional Cauchy distribution in the presence of one marginal distribution symmetric about the origin.

One marginal known
	$m = 1$			$m = 3$			$m = 5$
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0783	0.0499	0.6366	0.0842	0.0527	0.6261	0.0773	0.0525	0.6790
100	0.0399	0.0250	0.6277	0.0381	0.0228	0.5989	0.0382	0.0237	0.6213
200	0.0189	0.0119	0.6268	0.0195	0.0116	0.5976	0.0190	0.0116	0.6093
500	0.0074	0.0046	0.6184	0.0082	0.0045	0.5530	0.0075	0.0045	0.6012
One marginal unknown
	$m = 1$			$m = 3$			$m = 5$
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0776	0.0518	0.6679	0.0853	0.0533	0.6249	0.0778	0.0532	0.6841
100	0.0385	0.0234	0.6082	0.0404	0.0243	0.6016	0.0406	0.0241	0.5931
200	0.0204	0.0117	0.5720	0.0203	0.0122	0.6020	0.0197	0.0109	0.5538
500	0.0073	0.0045	0.6087	0.0079	0.0049	0.6158	0.0078	0.0044	0.5595

Table 3: Same as Table 2 except for data generated from $t(3)$.

One marginal known
	$m = 1$			$m = 3$			$m = 5$
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0581	0.0384	0.6602	0.0576	0.0357	0.6199	0.0602	0.0385	0.6403
100	0.0292	0.0177	0.6069	0.0262	0.0171	0.6519	0.0274	0.0169	0.6178
200	0.0149	0.0092	0.6204	0.0142	0.0088	0.6209	0.0136	0.0085	0.6239
500	0.0055	0.0036	0.6453	0.0057	0.0034	0.5973	0.0056	0.0033	0.5830
One marginal unknown
	$m = 1$			$m = 3$			$m = 5$
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0554	0.0359	0.6481	0.0561	0.0352	0.6278	0.0575	0.0373	0.6477
100	0.0291	0.0184	0.6314	0.0288	0.0171	0.5921	0.0286	0.0174	0.6072
200	0.0150	0.0096	0.6410	0.0136	0.0087	0.6344	0.0141	0.0086	0.6053
500	0.0056	0.0034	0.6112	0.0057	0.0033	0.5732	0.0057	0.0032	0.5622

Table 4: Same as Table 2 except for data generated from the copula with two $N(0,1)$ and one $t(3)$ marginals.

One marginal known
	$m = 1$			$m = 3$			$m = 5$
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0545	0.0366	0.6713	0.0541	0.0357	0.6601	0.0527	0.0373	0.7066
100	0.0273	0.0181	0.6626	0.0265	0.0183	0.6921	0.0269	0.0174	0.6470
200	0.0132	0.0088	0.6665	0.0142	0.0085	0.6005	0.0137	0.0082	0.5995
500	0.0054	0.0039	0.7113	0.0053	0.0033	0.6208	0.0053	0.0033	0.6162
One marginal unknown
	$m = 1$			$m = 3$			$m = 5$
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0562	0.0385	0.6841	0.0519	0.0348	0.6698	0.0543	0.0364	0.6707
100	0.0275	0.0193	0.7021	0.0272	0.0172	0.6321	0.0267	0.0172	0.6430
200	0.0127	0.0089	0.7012	0.0131	0.0085	0.6495	0.0129	0.0082	0.6324
500	0.0054	0.0035	0.6427	0.0055	0.0036	0.6451	0.0052	0.0032	0.6187

Table 5: Same as Table 2 except for data generated from the $3$-dimensional asymmetric Laplace distribution.

One marginal known
	$m = 1$			$m = 3$			$m = 5$
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0189	0.0113	0.6006	0.0190	0.0114	0.6004	0.0181	0.0115	0.6335
100	0.0090	0.0053	0.5845	0.0093	0.0051	0.5464	0.0089	0.0051	0.5760
200	0.0044	0.0026	0.5962	0.0042	0.0023	0.5547	0.0042	0.0023	0.5549
500	0.0016	0.0009	0.5867	0.0016	0.0009	0.5627	0.0016	0.0009	0.5525
One marginal unknown
	$m = 1$			$m = 3$			$m = 5$
$n$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$	$\lambda$	$\tilde{\lambda}$	$\tilde{\lambda} / \lambda$
50	0.0178	0.0118	0.6656	0.0196	0.0118	0.6023	0.0186	0.0130	0.6966
100	0.0085	0.0056	0.6565	0.0087	0.0055	0.6346	0.0083	0.0051	0.6162
200	0.0042	0.0027	0.6504	0.0043	0.0025	0.5757	0.0044	0.0025	0.5792
500	0.0017	0.0011	0.6342	0.0017	0.0010	0.5748	0.0017	0.0010	0.5539

Equations

\tilde{\boldsymbol{\theta}}=\frac{1}{n}\sum_{j=1}^{n}\frac{\boldsymbol{\psi}(Z_{j})}{1+{\mathbf{u}}(Z_{j})^{\top}\tilde{\boldsymbol{\zeta}}},

\sum_{j=1}^{n}\frac{{\mathbf{u}}(Z_{j})}{1+{\mathbf{u}}(Z_{j})^{\top}\boldsymbol{\zeta}}=0.

\hat{\pi}_{j}=\frac{1}{n}\frac{1}{1+\hat{\mathbf{u}}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}}},\quad j=1,\dots,n,

\hat{\boldsymbol{\theta}}=\sum_{j=1}^{n}\hat{\pi}_{j}\boldsymbol{\psi}(Z_{j})=\frac{1}{n}\sum_{j=1}^{n}\frac{\boldsymbol{\psi}(Z_{j})}{1+\hat{\mathbf{u}}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}}}.

{\mathbf{u}}_{n}=(u_{1},\dots,u_{m_{n}})^{\top},\quad\hat{\mathbf{u}}_{n}=(\hat{u}_{1},\dots,\hat{u}_{m_{n}})^{\top},

\tilde{\boldsymbol{\theta}}_{n}=\frac{1}{n}\sum_{j=1}^{n}\frac{\boldsymbol{\psi}(Z_{j})}{1+{\mathbf{u}}_{n}(Z_{j})^{\top}\tilde{\boldsymbol{\zeta}}_{n}}\quad\mbox{and}\quad\hat{\boldsymbol{\theta}}_{n}=\frac{1}{n}\sum_{j=1}^{n}\frac{\boldsymbol{\psi}(Z_{j})}{1+\hat{\mathbf{u}}_{n}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}}_{n}},

{\mathbf{W}}_{n}=\mathrm{Var}({\mathbf{u}}_{n}(Z)),\quad\bar{\mathbf{W}}_{n}=\frac{1}{n}\sum_{j=1}^{n}({\mathbf{u}}_{n}{\mathbf{u}}_{n}^{\top})(Z_{j}),\quad\hat{\mathbf{W}}_{n}=\frac{1}{n}\sum_{j=1}^{n}(\hat{\mathbf{u}}_{n}\hat{\mathbf{u}}_{n}^{\top})(Z_{j}).

0<\inf_{n}\inf_{\|{\mathbf{u}}\|=1}{\mathbf{u}}^{\top}{\mathbf{W}}_{n}{\mathbf{u}}\leq\sup_{n}\sup_{\|{\mathbf{u}}\|=1}{\mathbf{u}}^{\top}{\mathbf{W}}_{n}{\mathbf{u}}<\infty.

\max_{1\leq j\leq n}\|{\mathbf{u}}_{n}(Z_{j})\|=o_{p}(m_{n}^{-3/2}n^{1/2}),

|\bar{\mathbf{W}}_{n}-{\mathbf{W}}_{n}|_{o}=o_{p}(m_{n}^{-1}),

\frac{1}{n}\sum_{j=1}^{n}\left(\boldsymbol{\psi}(Z_{j})\otimes{\mathbf{u}}_{n}(Z_{j})-E\big(\boldsymbol{\psi}(Z_{j})\otimes{\mathbf{u}}_{n}(Z_{j})\big)\right)=o_{p}(m_{n}^{-1/2}).

\sqrt{n}(\tilde{\boldsymbol{\theta}}_{n}-\boldsymbol{\theta})\Longrightarrow\mathscr{N}(0,\varSigma_{0}),

\int u\,dQ_{t}=0,\quad u\in[{\mathbf{u}}_{\infty}].

\int ua\,dQ=0,\quad u\in[{\mathbf{u}}_{\infty}].

D({\mathbf{x}})=1-\|E\big(\mathbb{S}({\mathbf{x}}-{\mathbf{X}})\big)\|,\quad{\mathbf{x}}\in{\mathcal{R}}^{p},

D_{n}({\mathbf{x}})=1-\Big\|\frac{1}{n}\sum_{i=1}^{n}\mathbb{S}_{\mathbf{x}}({\mathbf{X}}_{i})\Big\|.

\mathbf{m}_{n}=\arg\max_{{\mathbf{x}}\in{\mathcal{R}}^{p}}D_{n}({\mathbf{x}})=\arg\min_{{\mathbf{x}}\in{\mathcal{R}}^{p}}\Big\|\frac{1}{n}\sum_{i=1}^{n}\mathbb{S}_{\mathbf{x}}({\mathbf{X}}_{i})\Big\|.

\widetilde{D}_{n}({\mathbf{x}})=1-\Big\|\frac{1}{n}\sum_{i=1}^{n}\frac{\mathbb{S}_{\mathbf{x}}({\mathbf{X}}_{i})}{1+{\mathbf{u}}({\mathbf{X}}_{i})^{\top}\tilde{\boldsymbol{\zeta}}}\Big\|,\quad{\mathbf{x}}\in{\mathcal{R}}^{p},

\sum_{j=1}^{n}\frac{{\mathbf{u}}({\mathbf{X}}_{j})}{1+{\mathbf{u}}({\mathbf{X}}_{j})^{\top}\boldsymbol{\zeta}}=0.

\widetilde{\mathbf{m}}=\arg\max_{{\mathbf{x}}\in{\mathcal{R}}^{p}}\widetilde{D}_{n}({\mathbf{x}})=\arg\min_{{\mathbf{x}}\in{\mathcal{R}}^{p}}\Big\|\frac{1}{n}\sum_{i=1}^{n}\frac{\mathbb{S}({\mathbf{x}}-{\mathbf{X}}_{i})}{1+{\mathbf{u}}({\mathbf{X}}_{i})^{\top}\tilde{\boldsymbol{\zeta}}}\Big\|.

\tilde{\boldsymbol{\theta}}({\mathbf{x}})=\frac{1}{n}\sum_{i=1}^{n}\frac{\mathbb{S}_{\mathbf{x}}({\mathbf{X}}_{i})}{1+{\mathbf{u}}({\mathbf{X}}_{i})^{\top}\tilde{\boldsymbol{\zeta}}},\quad{\mathbf{x}}\in{\mathbf{R}}^{p}.

E(a(\varepsilon))=0,\quad a\in L_{2,0}(F,\mathrm{odd}).

\tilde{\boldsymbol{\theta}}_{n}({\mathbf{x}})=\frac{1}{n}\sum_{i=1}^{n}\frac{\mathbb{S}_{\mathbf{x}}({\mathbf{X}}_{i})}{1+{\mathbf{u}}({\mathbf{X}}_{i})^{\top}\tilde{\boldsymbol{\zeta}}_{n}},\quad{\mathbf{x}}\in{\mathbf{R}}^{p}.

\tilde{\boldsymbol{\theta}}_{n}({\mathbf{x}})=\bar{\mathbb{S}}_{\mathbf{x}}-\bar{\boldsymbol{\varphi}}_{{\mathbf{x}}0}+o_{p}(n^{-1/2}),

\sqrt{n}(\tilde{\boldsymbol{\theta}}_{n}({\mathbf{x}})-\boldsymbol{\theta}({\mathbf{x}}))\Longrightarrow\mathscr{N}(0,\varSigma_{0}({\mathbf{x}})).

nE(|\bar{\mathbf{W}}_{n}-{\mathbf{W}}_{n}|_{o}^{2})\leq E(\|{\mathbf{u}}_{n}({\mathbf{X}})\|^{4})\leq m_{n}^{2}.

nE(\|{\mathbf{K}}_{n}\|^{2})\leq E(\|\mathbb{S}_{\mathbf{x}}({\mathbf{X}})\otimes{\mathbf{u}}_{n}({\mathbf{X}})\|^{2})\leq m_{n}E(\|\mathbb{S}_{\mathbf{x}}({\mathbf{X}})\|^{2})=m_{n}.

\mathrm{Var}({\mathbf{P}}_{{\mathbf{x}}}({\mathbf{X}}))=E\big(\mathbb{S}_{\mathbf{x}}({\mathbf{X}})\otimes{\mathbf{u}}({\mathbf{X}})^{\top}\big){\mathbf{W}}^{-1}E\big(\mathbb{S}_{\mathbf{x}}({\mathbf{X}})\otimes{\mathbf{u}}({\mathbf{X}})\big).

\sqrt{n}(D_{n}({\mathbf{x}})-D({\mathbf{x}}))\Longrightarrow\mathscr{N}(0,W_{0}({\mathbf{x}})).

\sqrt{n}(D_{n}({\mathbf{x}})-D({\mathbf{x}}))\Longrightarrow\mathscr{N}(0,W({\mathbf{x}})),

Taxonomy

Topics: Statistical Methods and Inference · Advanced Statistical Methods and Models · Statistical Methods and Bayesian Inference

Full text

Easy Maximum Empirical Likelihood Estimation of Linear Functionals Of A Probability Measure With Infinitely Many Constraints

Shan Wang
University of San Francisco, Department of Mathematics and Statistics, San Francisco, CA 94117, USA

Hanxiang Peng
Indiana University Purdue University Indianapolis, Department of Mathematical Sciences, Indianapolis, IN 46202-3267, USA

Keywords: Empirical likelihood; Infinitely many constraints; Maximum empirical likelihood estimator; Semiparametric efficiency; Spatial median.

AMS subject classifications: 62G05; 62G20, 62H11.

1 Introduction

Suppose that $Z_{1},\dots,Z_{n}$ are independent and identically distributed (i.i.d.) random variables with a common distribution $Q$ taking values in a measurable space ${\mathcal{Z}}$. In this article, we are interested in efficient estimation of the linear functional $\boldsymbol{\theta}=\int\boldsymbol{\psi}\,dQ$ of $Q$ for some square-integrable function $\boldsymbol{\psi}$ from ${\mathcal{Z}}$ to ${\mathcal{R}}^{r}$ when side information is available through a vector function (constraint) ${\mathbf{u}}$ which satisfies

(C)

${\mathbf{u}}$ is measurable from ${\mathcal{Z}}$ to ${\mathcal{R}}^{m}$ such that $\int{\mathbf{u}}\,dQ=0$ and the variance-covariance matrix $\int{\mathbf{u}}{\mathbf{u}}^{\top}\,dQ$ is nonsingular.

The commonly used sample mean $\bar{\boldsymbol{\psi}}=\frac{1}{n}\sum_{j=1}^{n}\boldsymbol{\psi}(Z_{j})$, as an estimator of $\boldsymbol{\theta}=E(\boldsymbol{\psi}(Z))$, does not use the information and is not efficient in the sense of a least dispersed regular estimator; see, e.g., Bickel, Klaassen, Ritov and Wellner (1993). Based on the criterion of maximum empirical likelihood, an improved estimator which utilizes the information is

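$$\tilde{\boldsymbol{\theta}}=\frac{1}{n}\sum_{j=1}^{n}\frac{\boldsymbol{\psi}(Z_{j})}{1+{\mathbf{u}}(Z_{j})^{\top}\tilde{\boldsymbol{\zeta}}},\qquad (1.1)$$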

where $\tilde{\zeta}$ is the solution to the equation

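$$\sum_{j=1}^{n}\frac{{\mathbf{u}}(Z_{j})}{1+{\mathbf{u}}(Z_{j})^{\top}\boldsymbol{\zeta}}=0.\qquad (1.2)$$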

We shall refer to $\tilde{\boldsymbol{\theta}}$ as the EL-weighted estimator.
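To make the construction concrete, here is a minimal Python sketch of the EL-weighted estimator for a single scalar constraint ($m=1$). It is an illustration under stated assumptions rather than code from the paper: the function names `el_weights_scalar` and `el_weighted_mean` are hypothetical, the root of the constraint equation is located by bisection (any root finder would serve), and the toy data with known mean zero are made up.

```python
def el_weights_scalar(u, tol=1e-12, max_iter=200):
    """EL weights 1 / (n * (1 + u_j * zeta)) for one scalar constraint E u(Z) = 0.

    Hypothetical helper: zeta solves sum_j u_j / (1 + u_j * zeta) = 0.
    """
    n = len(u)
    lo_u, hi_u = min(u), max(u)
    if not (lo_u < 0.0 < hi_u):
        raise ValueError("zero must lie strictly inside the range of u")
    # All denominators 1 + u_j * zeta are positive iff -1/hi_u < zeta < -1/lo_u.
    lo, hi = -1.0 / hi_u, -1.0 / lo_u

    def g(zeta):
        return sum(uj / (1.0 + uj * zeta) for uj in u)

    # g is strictly decreasing on (lo, hi), so bisection brackets its unique root.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    zeta = 0.5 * (lo + hi)
    return [1.0 / (n * (1.0 + uj * zeta)) for uj in u], zeta


def el_weighted_mean(psi_vals, u_vals):
    """EL-weighted estimate of E psi(Z): sum_j weight_j * psi(Z_j)."""
    weights, _ = el_weights_scalar(u_vals)
    return sum(w * p for w, p in zip(weights, psi_vals))


# Toy example (made-up data): the mean of Z is assumed known to be 0, so u(z) = z.
z = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
weights, zeta = el_weights_scalar(z)
theta_tilde = el_weighted_mean([zi ** 2 for zi in z], z)  # estimates E(Z^2)
```

With these weights the weighted mean of $u$ vanishes up to the bisection tolerance, so the side information holds exactly in the reweighted sample.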

There is an extensive literature on empirical likelihood hypothesis testing; see, e.g., Owen (1988, 2001). Empirical likelihood was soon also used to construct point estimators. Qin and Lawless (1994) studied maximum empirical likelihood estimators (MELE) and showed in their Corollary 2 that MELE are fully efficient. As a special case of MELE, estimators of the preceding easy form were studied by Zhang (1995, 1997) for M-estimation and quantile processes in the presence of auxiliary information (side information). For a fixed number $m$ of known constraint functions, the asymptotic normality (ASN) and efficiency of MELE were established.

Hjort, McKeague and Van Keilegom (2009) extended the scope of empirical likelihood hypothesis testing, developed a general theory for constraints with nuisance parameters, and considered the case of infinitely many constraints. Peng and Schick (2013) generalized empirical likelihood testing to allow the number of constraints to grow with the sample size and the constraints to use estimated criteria functions. Peng and Tan (2018) extended the latter results to U-statistics based general estimating equations with side information.

Parente and Smith (2011) studied generalized empirical likelihood estimators for irregular constraints. Peng and Schick (2018) presented a theory of maximum empirical likelihood estimation and empirical likelihood ratio testing with irregular and estimated constraint functions. Wang and Peng (2022) used the easy EL-weighted approach to construct improved estimators of linear functionals of a probability measure when side information is available. Motivated by the nuisance parameters common in semiparametric models and the infinite dimension of such models, they studied the use of estimated constraint functions when the number of constraints grows with the sample size. They applied the results to improve estimation efficiency in structural equation models.

We shall rely on the results of Wang and Peng (2022) to construct efficient estimators of linear functionals of a probability measure for a few cases of side information that are determined by infinitely many constraints. Bickel, Ritov and Wellner (1991) characterized efficient estimation of $E(h(X;Y))$ for known $h$ when the marginal distributions of $X$ and of $Y$ are known, and constructed an efficient estimator based on a minimum chi-square-type objective function. Peng and Schick (2005) calculated the information lower bound when the marginal distributions are unknown but identical, and constructed an efficient estimator based on a least squares objective. Peng and Schick (2018) constructed empirical likelihood tests of stochastic independence and distributional symmetry. Each of independence, symmetry, and known or equal marginal distributions is equivalent to infinitely many equations (constraints) and can be used to improve estimation efficiency. Here we construct the EL-weighted estimators and demonstrate their semiparametric efficiency. Our estimators have a simple analytic form and incorporate side information easily to improve efficiency.

The efficiency criteria used are that of a least dispersed regular estimator and that of a locally asymptotic minimax estimator. They are based on the convolution theorems and on the lower bounds of the local asymptotic risk in LAN and LAMN families; see the monograph by Bickel et al. (1993), among others.

In what follows, we summarize some results from Wang and Peng (2022) for convenience, and we provide a proof of semiparametric efficiency. In many semiparametric models, the constraint vector function ${\mathbf{u}}=(u_{1},\dots,u_{m})^{\top}$ is unknown and must be estimated by some measurable function $\hat{\mathbf{u}}=(\hat{u}_{1},\dots,\hat{u}_{m})^{\top}$. Using it, we now work with the EL-weights,

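$$\hat{\pi}_{j}=\frac{1}{n}\frac{1}{1+\hat{\mathbf{u}}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}}},\quad j=1,\dots,n,$$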

where $\hat{\boldsymbol{\zeta}}$ solves Eqt (1.2) with ${\mathbf{u}}=\hat{\mathbf{u}}$. A natural estimate $\hat{\boldsymbol{\theta}}$ of $\boldsymbol{\theta}$ now is

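$$\hat{\boldsymbol{\theta}}=\sum_{j=1}^{n}\hat{\pi}_{j}\boldsymbol{\psi}(Z_{j})=\frac{1}{n}\sum_{j=1}^{n}\frac{\boldsymbol{\psi}(Z_{j})}{1+\hat{\mathbf{u}}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}}}.$$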
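A short check, standard for empirical likelihood and added here only for orientation, confirms that the EL-weights $\hat{\pi}_{j}=n^{-1}/(1+\hat{\mathbf{u}}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}})$ form a probability vector under which the estimated constraint has mean zero. It uses only the equation solved by $\hat{\boldsymbol{\zeta}}$ and assumes that all denominators are positive:

$$\sum_{j=1}^{n}\hat{\pi}_{j}\hat{\mathbf{u}}(Z_{j})=\frac{1}{n}\sum_{j=1}^{n}\frac{\hat{\mathbf{u}}(Z_{j})}{1+\hat{\mathbf{u}}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}}}=0,$$

$$\sum_{j=1}^{n}\hat{\pi}_{j}=\frac{1}{n}\sum_{j=1}^{n}\Big(1-\frac{\hat{\mathbf{u}}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}}}{1+\hat{\mathbf{u}}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}}}\Big)=1-\hat{\boldsymbol{\zeta}}^{\top}\sum_{j=1}^{n}\hat{\pi}_{j}\hat{\mathbf{u}}(Z_{j})=1.$$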

We now allow the number of constraints to depend on the sample size $n$, $m=m_{n}$, and to tend to infinity slowly with $n$. To stress the dependence, write

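$${\mathbf{u}}_{n}=(u_{1},\dots,u_{m_{n}})^{\top},\quad\hat{\mathbf{u}}_{n}=(\hat{u}_{1},\dots,\hat{u}_{m_{n}})^{\top},$$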

and $\tilde{\boldsymbol{\theta}}_{n}=\tilde{\boldsymbol{\theta}}$, $\hat{\boldsymbol{\theta}}_{n}=\hat{\boldsymbol{\theta}}$ for the corresponding estimators of $\boldsymbol{\theta}$, that is,

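$$\tilde{\boldsymbol{\theta}}_{n}=\frac{1}{n}\sum_{j=1}^{n}\frac{\boldsymbol{\psi}(Z_{j})}{1+{\mathbf{u}}_{n}(Z_{j})^{\top}\tilde{\boldsymbol{\zeta}}_{n}}\quad\mbox{and}\quad\hat{\boldsymbol{\theta}}_{n}=\frac{1}{n}\sum_{j=1}^{n}\frac{\boldsymbol{\psi}(Z_{j})}{1+\hat{\mathbf{u}}_{n}(Z_{j})^{\top}\hat{\boldsymbol{\zeta}}_{n}},$$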

where $\tilde{\boldsymbol{\zeta}}_{n}$ and $\hat{\boldsymbol{\zeta}}_{n}$ solve Eqt (1.2) with ${\mathbf{u}}={\mathbf{u}}_{n}$ and ${\mathbf{u}}=\hat{\mathbf{u}}_{n}$, respectively.

The ASN of $\tilde{\boldsymbol{\theta}}_{n}$ and of $\hat{\boldsymbol{\theta}}_{n}$ is given in Theorems 3 and 4 of Wang and Peng (2022), respectively. We now prove the semiparametric efficiency of $\tilde{\boldsymbol{\theta}}_{n}$ and quote Theorem 4 in the Appendix for convenience. For ${\mathbf{a}}\in{\mathcal{R}}^{m}$, write $\|{\mathbf{a}}\|$ for the Euclidean norm. For ${\mathbf{a}},{\mathbf{b}}\in{\mathcal{R}}^{m}$, write ${\mathbf{a}}\otimes{\mathbf{b}}$ for the Kronecker product. Let $L_{2}^{m}(Q)=\left\{{\mathbf{f}}=(f_{1},\dots,f_{m})^{\top}:\int\|{\mathbf{f}}\|^{2}\,dQ<\infty\right\}$, and let $L_{2,0}^{m}(Q)=\left\{{\mathbf{f}}\in L_{2}^{m}(Q):\int{\mathbf{f}}\,dQ=0\right\}$. For ${\mathbf{f}}\in L_{2}^{m}(Q)$, write $\bar{\mathbf{f}}=n^{-1}\sum_{j=1}^{n}{\mathbf{f}}(Z_{j})$ for the sample average of ${\mathbf{f}}(Z_{1}),\dots,{\mathbf{f}}(Z_{n})$, and $[{\mathbf{f}}]$ for the closed linear span of the components $f_{1},\dots,f_{m}$ in $L_{2}(Q)$. Let $Z$ be an i.i.d. copy of $Z_{1}$. Denote by $[{\mathbf{u}}_{\infty}]$ the closed linear span of ${\mathbf{u}}_{\infty}=(u_{1},u_{2},\dots)$ in $L_{2,0}(Q)$. Set

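$${\mathbf{W}}_{n}=\mathrm{Var}({\mathbf{u}}_{n}(Z)),\quad\bar{\mathbf{W}}_{n}=\frac{1}{n}\sum_{j=1}^{n}({\mathbf{u}}_{n}{\mathbf{u}}_{n}^{\top})(Z_{j}),\quad\hat{\mathbf{W}}_{n}=\frac{1}{n}\sum_{j=1}^{n}(\hat{\mathbf{u}}_{n}\hat{\mathbf{u}}_{n}^{\top})(Z_{j}).$$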

Following Peng and Schick (2013), a sequence ${\mathbf{W}}_{n}$ of $m_{n}\times m_{n}$ dispersion matrices is said to be regular if

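$$0<\inf_{n}\inf_{\|{\mathbf{u}}\|=1}{\mathbf{u}}^{\top}{\mathbf{W}}_{n}{\mathbf{u}}\leq\sup_{n}\sup_{\|{\mathbf{u}}\|=1}{\mathbf{u}}^{\top}{\mathbf{W}}_{n}{\mathbf{u}}<\infty.$$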

Theorem 1.1.

Suppose that ${\mathbf{u}}_{n}$ satisfies (C) for each $m=m_{n}$ such that

$$\max_{1\leq j\leq n}\|{\mathbf{u}}_{n}(Z_{j})\|=o_{p}(m_{n}^{-3/2}n^{1/2}),$$

the sequence of $m_{n}\times m_{n}$ dispersion matrices ${\mathbf{W}}_{n}$ is regular and satisfies

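$$|\bar{\mathbf{W}}_{n}-{\mathbf{W}}_{n}|_{o}=o_{p}(m_{n}^{-1}),$$

$$\frac{1}{n}\sum_{j=1}^{n}\left(\boldsymbol{\psi}(Z_{j})\otimes{\mathbf{u}}_{n}(Z_{j})-E\big(\boldsymbol{\psi}(Z_{j})\otimes{\mathbf{u}}_{n}(Z_{j})\big)\right)=o_{p}(m_{n}^{-1/2}).$$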

Then $\tilde{\boldsymbol{\theta}}_{n}$ is semiparametrically efficient as $m_{n}\to\infty$. Moreover,

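$$\sqrt{n}(\tilde{\boldsymbol{\theta}}_{n}-\boldsymbol{\theta})\Longrightarrow\mathscr{N}(0,\varSigma_{0}),$$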

where $\varSigma_{0}=\mathop{\rm Var}\nolimits(\mbox{\boldmath$ \psi $\unboldmath}(Z))-\mathop{\rm Var}\nolimits(\mbox{\boldmath$ \varphi $\unboldmath}_{0}(Z))$ with $\mbox{\boldmath$ \varphi $\unboldmath}_{0}=\Pi(\mbox{\boldmath$ \psi $\unboldmath}|[{\mathbf{u}}_{\infty}])$ .

Proof. We only need to show the efficiency. It suffices to prove that the orthogonal complement ${\mathcal{T}}=[{\mathbf{u}}_{\infty}]^{\perp}$ in $L_{2,0}(Q)$ is the tangent space. To this end, let $\{Q_{t}:|t|\leq t_{0}\}$ with $Q_{0}=Q$ be a regular parametric submodel with score function $a$. By (C),

$$\int u\,dQ_{t}=0,\quad u\in[{\mathbf{u}}_{\infty}].$$

Differentiating both sides of the equality with respect to $t$ at $t=0$ yields

$$\int ua\,dQ=0,\quad u\in[{\mathbf{u}}_{\infty}].$$

This shows $a\in{\mathcal{T}}$. Conversely, for any bounded $a\in{\mathcal{T}}$, consider $q_{t}=dQ_{t}/dQ=1+at$, $|t|\leq t_{0}$, for sufficiently small $t_{0}$. It is clear that $q_{t}$ is a density and that the submodel with this density has score function $a$, which satisfies $\int ua\,dQ=0$. Since bounded functions are dense in ${\mathcal{T}}$, the above conclusion holds for any $a\in{\mathcal{T}}$. This shows that ${\mathcal{T}}$ is the tangent space. $\hfill\square$
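The two steps of the proof can be spelled out for the linear submodel $q_{t}=1+ta$. The following is a worked check under the proof's own assumptions, with $a$ bounded and $\int a\,dQ=0$:

$$\int q_{t}\,dQ=1+t\int a\,dQ=1,\qquad q_{t}\geq 1-t_{0}\sup|a|\geq 0\quad\mbox{if}\quad t_{0}\sup|a|\leq 1,$$

$$\int u\,dQ_{t}=\int u\,dQ+t\int ua\,dQ=t\int ua\,dQ,\quad u\in[{\mathbf{u}}_{\infty}].$$

Hence the constraint $\int u\,dQ_{t}=0$ holds for all $|t|\leq t_{0}$ exactly when $\int ua\,dQ=0$ for every $u\in[{\mathbf{u}}_{\infty}]$, that is, when $a$ is orthogonal to $[{\mathbf{u}}_{\infty}]$.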

The article is organized as follows. In Section 2, the EL-weighted spatial depth function is constructed, and its ASN and efficiency are established in the presence of distributional symmetry. The ASN and efficiency of the EL-weighted estimators of linear functionals are proved in Section 3 when the marginal distribution functions are known, and in Section 4 when the marginal distributions are unknown but equal. Simulation results are reported in Section 5. The Appendix contains Theorem 4 of Wang and Peng (2022).

2 The EL-weighted spatial median

In this section, we introduce the EL-weighted spatial depth function, establish its efficiency, and give its asymptotic normality.

Statistical depth functions provide a center-outward ordering of points in ${\mathcal{R}}^{p}$ with respect to a distribution. High depth values correspond to centrality, while low values correspond to “outlyingness”. Depth functions possess robustness properties and can be used to define multivariate medians, which are robust location estimators. Common depth functions include the Tukey depth (halfspace depth), the simplicial depth, the projection depth, and the spatial depth. Here we use the easy EL approach to construct improved depth functions, and illustrate it with the spatial depth. The (population) spatial depth function $D({\mathbf{x}})$ with respect to a distribution $F$ is defined as

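$$D({\mathbf{x}})=1-\|E\big(\mathbb{S}({\mathbf{x}}-{\mathbf{X}})\big)\|,\quad{\mathbf{x}}\in{\mathcal{R}}^{p},$$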

where $\mathbb{S}({\mathbf{x}})={\mathbf{x}}/\|{\mathbf{x}}\|$ if ${\mathbf{x}}\neq 0$ ( $\mathbb{S}(0)=0$ ) is the spatial sign function and ${\mathbf{X}}$ has the distribution function (DF) $F({\mathbf{x}})$ , denoted by ${\mathbf{X}}\sim F({\mathbf{x}})$ . The depth function $D({\mathbf{x}})$ can be estimated by the sample depth function given by

$$D_{n}({\mathbf{x}})=1-\Big\|\frac{1}{n}\sum_{i=1}^{n}\mathbb{S}_{\mathbf{x}}({\mathbf{X}}_{i})\Big\|,$$

where $\mathbb{S}_{\mathbf{x}}({\mathbf{t}})=\mathbb{S}({\mathbf{t}}-{\mathbf{x}})$ . The sample spatial median $\mathbf{m}_{n}$ is defined as the value which maximizes the depth function, that is,

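$$\mathbf{m}_{n}=\arg\max_{{\mathbf{x}}\in{\mathcal{R}}^{p}}D_{n}({\mathbf{x}})=\arg\min_{{\mathbf{x}}\in{\mathcal{R}}^{p}}\Big\|\frac{1}{n}\sum_{i=1}^{n}\mathbb{S}_{\mathbf{x}}({\mathbf{X}}_{i})\Big\|.$$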
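The sample spatial depth is simple to compute directly from its definition, $D_{n}({\mathbf{x}})=1-\|n^{-1}\sum_{i}\mathbb{S}({\mathbf{X}}_{i}-{\mathbf{x}})\|$. The following Python sketch does so with the standard library only; the function names `spatial_sign` and `sample_spatial_depth` and the four-point toy data set are assumptions made for illustration.

```python
import math


def spatial_sign(v):
    """Spatial sign S(v) = v / ||v||, with S(0) = 0."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return [0.0] * len(v)
    return [c / norm for c in v]


def sample_spatial_depth(x, data):
    """Sample spatial depth: 1 - || (1/n) * sum_i S(X_i - x) ||."""
    n = len(data)
    p = len(x)
    avg = [0.0] * p
    for xi in data:
        s = spatial_sign([xi[k] - x[k] for k in range(p)])
        for k in range(p):
            avg[k] += s[k] / n
    return 1.0 - math.sqrt(sum(c * c for c in avg))


# Toy data (made up): the four corners of a square centered at the origin.
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
depth_center = sample_spatial_depth((0.0, 0.0), square)    # signs cancel at the center
depth_far = sample_spatial_depth((100.0, 0.0), square)     # signs nearly align far away
```

At the center of the square the four spatial signs cancel, which gives the largest possible depth, while a point far from the data sees nearly parallel signs and receives a depth close to zero.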

Suppose that there is available additional information that can be expressed by a constraint function ${\mathbf{u}}$ . While the sample depth $D_{n}({\mathbf{x}})$ does not utilize the information, the EL-weighted depth function $\widetilde{D}_{n}({\mathbf{x}})$ makes use of it and is defined by

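$$\widetilde{D}_{n}({\mathbf{x}})=1-\Big\|\frac{1}{n}\sum_{i=1}^{n}\frac{\mathbb{S}_{\mathbf{x}}({\mathbf{X}}_{i})}{1+{\mathbf{u}}({\mathbf{X}}_{i})^{\top}\tilde{\boldsymbol{\zeta}}}\Big\|,\quad{\mathbf{x}}\in{\mathcal{R}}^{p},\qquad (2.1)$$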

where $\tilde{\zeta}$ is the solution to the equation

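$$\sum_{j=1}^{n}\frac{{\mathbf{u}}({\mathbf{X}}_{j})}{1+{\mathbf{u}}({\mathbf{X}}_{j})^{\top}\boldsymbol{\zeta}}=0.\qquad (2.2)$$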

The EL-weighted spatial median $\widetilde{\mathbf{m}}$ is defined as the value which maximizes the EL-weighted depth function, that is,

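$$\widetilde{\mathbf{m}}=\arg\max_{{\mathbf{x}}\in{\mathcal{R}}^{p}}\widetilde{D}_{n}({\mathbf{x}})=\arg\min_{{\mathbf{x}}\in{\mathcal{R}}^{p}}\Big\|\frac{1}{n}\sum_{i=1}^{n}\frac{\mathbb{S}({\mathbf{x}}-{\mathbf{X}}_{i})}{1+{\mathbf{u}}({\mathbf{X}}_{i})^{\top}\tilde{\boldsymbol{\zeta}}}\Big\|.$$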
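The text does not prescribe an algorithm for computing the EL-weighted spatial median. One possibility, offered here purely as an assumption, is a weighted Weiszfeld iteration: for positive weights $w_{i}$ the weighted sum of spatial signs is the gradient of $\sum_{i}w_{i}\|{\mathbf{x}}-{\mathbf{X}}_{i}\|$, so a minimizer of that convex function that is not a data point makes the weighted sign average vanish and therefore maximizes the weighted depth. The function name `weighted_spatial_median` and the toy data are hypothetical.

```python
import math


def weighted_spatial_median(data, weights, tol=1e-10, max_iter=10000):
    """Weighted Weiszfeld iteration for argmin_x sum_i w_i * ||x - X_i||.

    A sketch only: weights are assumed positive (for example EL weights),
    and a data point that coincides with the current iterate is skipped.
    """
    p = len(data[0])
    total = sum(weights)
    # Start from the weighted mean of the observations.
    x = [sum(w * xi[k] for w, xi in zip(weights, data)) / total for k in range(p)]
    for _ in range(max_iter):
        num = [0.0] * p
        den = 0.0
        for w, xi in zip(weights, data):
            d = math.sqrt(sum((xi[k] - x[k]) ** 2 for k in range(p)))
            if d < 1e-12:
                continue
            for k in range(p):
                num[k] += w * xi[k] / d
            den += w / d
        new_x = [c / den for c in num]
        step = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_x, x)))
        x = new_x
        if step < tol:
            break
    return x


# Toy data (made up): with equal weights, the spatial median of the vertices of
# a convex quadrilateral is the intersection of its diagonals, here (12/7, 12/7).
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
median = weighted_spatial_median(points, [0.25] * 4)
```

To obtain the EL-weighted spatial median one would pass the EL weights $1/\{n(1+{\mathbf{u}}({\mathbf{X}}_{i})^{\top}\tilde{\boldsymbol{\zeta}})\}$ in place of the equal weights used in the toy call.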

The EL-weighted estimator of $\mbox{\boldmath$ \theta $\unboldmath}({\mathbf{x}})=E(\mathbb{S}_{\mathbf{x}}({\mathbf{X}}))$ is given by

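$$\tilde{\boldsymbol{\theta}}({\mathbf{x}})=\frac{1}{n}\sum_{i=1}^{n}\frac{\mathbb{S}_{\mathbf{x}}({\mathbf{X}}_{i})}{1+{\mathbf{u}}({\mathbf{X}}_{i})^{\top}\tilde{\boldsymbol{\zeta}}},\quad{\mathbf{x}}\in{\mathbf{R}}^{p}.$$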

Remark 2.1.

The sample spatial depth $D_{n}({\mathbf{x}})$ is robust with breakdown point $1/2$. The EL-weighted depth $\widetilde{D}_{n}({\mathbf{x}})$ improves efficiency but reduces robustness, a consequence of EL-weights that can be zero. One can robustify $\widetilde{D}_{n}({\mathbf{x}})$ by truncating the EL-weights from below by a fixed constant. Truncation is commonly used in the inverse probability weighting method. Truncation does, however, lead to some loss of efficiency.

Known marginal medians. In our simulation study, we considered the side information that the bivariate random vector ${\mathbf{X}}=(X_{1},X_{2})^{\top}$ has known marginal medians $m_{10}$ and $m_{20}$, that is, the componentwise median $(m_{10},m_{20})^{\top}$ is known. In this case, ${\mathbf{u}}(x_{1},x_{2})=(\mathbf{1}[x_{1}\leq m_{10}]-1/2,\mathbf{1}[x_{2}\leq m_{20}]-1/2)^{\top}$. The motivation is as follows. It is well known that the spatial median is a better location estimator than the componentwise median because the former takes into account the correlation of the components while the latter ignores it; see Chen, Dang, Peng and Bart (2009). We are interested in how much information is lost when the componentwise median is used, which we assess by how much efficiency the EL-weighted spatial median $\widetilde{\mathbf{m}}$ (when the marginal medians are known) gains over the sample spatial median (when the marginal medians are unknown).
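As an illustration of the known-medians constraint, the Python sketch below builds ${\mathbf{u}}$ for bivariate data and solves the two-dimensional constraint equation by a damped Newton iteration. This is a sketch under assumptions, not the authors' implementation: the function names, the Newton solver with step halving, and the ten made-up observations are all hypothetical, and the solver presumes that observations fall on both sides of each known median in all four combinations.

```python
import math


def median_constraints(data, m10, m20):
    """u(x1, x2) = (1[x1 <= m10] - 1/2, 1[x2 <= m20] - 1/2) per observation."""
    return [((1.0 if x1 <= m10 else 0.0) - 0.5,
             (1.0 if x2 <= m20 else 0.0) - 0.5) for x1, x2 in data]


def log_el(u, z1, z2):
    """sum_j log(1 + u_j' zeta); -inf if some 1 + u_j' zeta is not positive."""
    total = 0.0
    for u1, u2 in u:
        d = 1.0 + u1 * z1 + u2 * z2
        if d <= 0.0:
            return -math.inf
        total += math.log(d)
    return total


def solve_zeta_2d(u, tol=1e-10, max_iter=100):
    """Damped Newton iteration for sum_j u_j / (1 + u_j' zeta) = 0 with m = 2."""
    z1 = z2 = 0.0
    for _ in range(max_iter):
        g1 = g2 = h11 = h12 = h22 = 0.0
        for u1, u2 in u:
            d = 1.0 + u1 * z1 + u2 * z2
            g1 += u1 / d
            g2 += u2 / d
            h11 += u1 * u1 / d ** 2
            h12 += u1 * u2 / d ** 2
            h22 += u2 * u2 / d ** 2
        if abs(g1) + abs(g2) < tol:
            break
        det = h11 * h22 - h12 * h12
        s1 = (h22 * g1 - h12 * g2) / det  # Newton direction H^{-1} g (Cramer's rule)
        s2 = (h11 * g2 - h12 * g1) / det
        t, base = 1.0, log_el(u, z1, z2)
        # Halve the step until it is feasible and does not decrease the objective.
        while t > 1e-8 and log_el(u, z1 + t * s1, z2 + t * s2) < base:
            t *= 0.5
        z1, z2 = z1 + t * s1, z2 + t * s2
    return z1, z2


def el_weights_known_medians(data, m10, m20):
    """EL weights 1 / (n * (1 + u_j' zeta)) under known marginal medians."""
    u = median_constraints(data, m10, m20)
    z1, z2 = solve_zeta_2d(u)
    n = len(data)
    return [1.0 / (n * (1.0 + u1 * z1 + u2 * z2)) for u1, u2 in u]


# Toy data (made up) with both marginal medians assumed known to be 0.
data = [(-1.0, -2.0), (-0.5, -0.3), (-2.0, -1.0), (-1.0, 1.0), (0.5, -1.0),
        (2.0, -0.2), (1.0, 1.0), (2.0, 3.0), (0.3, 0.7), (1.5, 0.4)]
weights = el_weights_known_medians(data, 0.0, 0.0)
```

Under the resulting weights, each coordinate puts total weight one half at or below its known median, which is exactly the side information encoded by ${\mathbf{u}}$.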

Growing number of constraints. Suppose that there exists some constant vector ${\mathbf{c}}$ such that $T={\mathbf{c}}^{\top}{\mathbf{X}}$ is symmetric about some known value $\tau_{0}$ . Let $\varepsilon_{j}={\mathbf{c}}^{\top}{\mathbf{X}}_{j}-\tau_{0},j=1,\ldots,n$ . Then ${\varepsilon}_{j}$ ’s are i.i.d. random variables which are symmetric about zero. Let ${\varepsilon}$ be an i.i.d. copy of ${\varepsilon}_{j}$ ’s, and let $F$ be the distribution function of ${\varepsilon}$ . Let $L_{2,0}(F,\mathrm{odd})$ be the subspace of $L_{2,0}(F)$ consisting of the odd functions. Symmetry of $\varepsilon$ about zero implies

[TABLE]

Let $s_{k}(t)=\sin(k\pi t),t\in[-1,1],k=1,2,...$ be the orthonormal trigonometric basis. Define $G(t)=2F(t)-1,t\in{\mathcal{R}}$ . Then $G(t)$ is an odd function in $L_{2,0}(F,\mathrm{odd})$ , and $s_{k}(G(t)),k=1,2,...$ form a basis of the space.
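To make the functions $s_{k}(G(t))$ concrete, the sketch below evaluates the first $m$ of them for one hypothetical choice of $F$ , the standard normal distribution, for which $G(t)=2F(t)-1$ equals the error function at $t/\sqrt{2}$ . The names `G_normal` and `sine_constraints` are illustrative, and the normal choice is an assumption made only for the example.

```python
import math

def G_normal(t):
    """G(t) = 2 F(t) - 1 for F the standard normal cdf (assumed for illustration)."""
    return math.erf(t / math.sqrt(2.0))

def sine_constraints(eps, m, G=G_normal):
    """Vector (s_1(G(eps)), ..., s_m(G(eps))) with s_k(t) = sin(k * pi * t)."""
    g = G(eps)
    return [math.sin(k * math.pi * g) for k in range(1, m + 1)]
```

Each coordinate is odd in `eps` and bounded by one in absolute value, so the Euclidean norm of the vector is at most $m^{1/2}$ .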

In this case, the constraints are ${\mathbf{u}}_{n}({\mathbf{X}}_{j})=(s_{1}(G(\varepsilon_{j})),...,s_{m_{n}}(G(\varepsilon_{j})))^{\top}$ , where we allow $m_{n}$ to grow to infinity slowly with $n$ . The EL-weighted depth function is calculated by (2.1) with ${\mathbf{u}}={\mathbf{u}}_{n}$ and $\mbox{\boldmath$ \tilde{\zeta} $\unboldmath}=\mbox{\boldmath$ \tilde{\zeta} $\unboldmath}_{n}$ which solves Eqt (2.2) with ${\mathbf{u}}={\mathbf{u}}_{n}$ . The EL-weighted estimator of $\mbox{\boldmath$ \theta $\unboldmath}({\mathbf{x}})=E(\mathbb{S}_{\mathbf{x}}({\mathbf{X}}))$ then is

[TABLE]

Theorem 2.1.

Suppose that $F$ is continuous. Then for arbitrary but fixed ${\mathbf{x}}\in{\mathbf{R}}^{p}$ , as $m_{n}\to\infty$ such that $m_{n}^{4}/n\to 0$ , $\mbox{\boldmath$ \tilde{\theta} $\unboldmath}_{n}({\mathbf{x}})$ in (2.5) satisfies

[TABLE]

where $\mbox{\boldmath$ \varphi $\unboldmath}_{{\mathbf{x}}0}=\Pi(\mathbb{S}_{\mathbf{x}}({\mathbf{X}})|L_{2,0}(F,\mathrm{odd}))$ is the projection of $\mathbb{S}_{\mathbf{x}}({\mathbf{X}})$ onto $L_{2,0}(F,\mathrm{odd})$ . As a consequence, if $\varSigma_{0}({\mathbf{x}})=\mathop{\rm Var}\nolimits(\mathbb{S}_{\mathbf{x}}({\mathbf{X}}))-\mathop{\rm Var}\nolimits(\mbox{\boldmath$ \varphi $\unboldmath}_{{\mathbf{x}}0}({\mathbf{X}}))$ is nonsingular,

[TABLE]

Proof of Theorem 2.1. We shall apply Theorem 1.1 to prove the result. Since ${\mathbf{W}}_{n}=E(({\mathbf{u}}{\mathbf{u}}^{\top})({\mathbf{X}}))={\mathbf{I}}_{m_{n}}$ is the identity matrix, it follows that (C) holds and ${\mathbf{W}}_{n}$ is regular. As $\|{\mathbf{u}}_{n}({\mathbf{X}}_{j})\|\leq m_{n}^{1/2}$ for each $j$ and $m_{n}^{4}/n=o(1)$ , (1.6) is satisfied, while (1.7) holds in view of the inequalities

[TABLE]

Let ${\mathbf{K}}_{n}$ be the left hand side of (1.8). Then (1.8) follows from

[TABLE]

We now apply Theorem 1.1 to complete the proof. $\hfill\square$

Efficiency gain and ASN for $\tilde{\mathbf{m}}$ . By the properties of empirical likelihood, one concludes that $\widetilde{D}_{n}({\mathbf{x}})$ is a valid depth function at least for large $n$ as all $1+{\mathbf{u}}({\mathbf{X}}_{i})^{\top}\mbox{\boldmath$ \tilde{\zeta} $\unboldmath}>0$ . Fix ${\mathbf{x}}\in{\mathcal{R}}^{p}$ , let ${\mathbf{P}}_{{\mathbf{x}}}$ be the projection of $\mathbb{S}_{\mathbf{x}}({\mathbf{X}})$ onto the closed linear span $[{\mathbf{u}}_{\infty}]=L_{2,0}(F,odd)$ . Then $\varSigma_{0}({\mathbf{x}})=\mathop{\rm Var}\nolimits(\mathbb{S}_{\mathbf{x}}({\mathbf{X}}))-\mathop{\rm Var}\nolimits({\mathbf{P}}_{{\mathbf{x}}}({\mathbf{X}}))$ . Clearly,

[TABLE]

Let $\mathbb{S}_{2}({\mathbf{x}})=\mathbb{S}\big{(}E(\mathbb{S}_{\mathbf{x}}({\mathbf{X}}))\big{)}$ . If ${\mathbf{W}}_{0}({\mathbf{x}}):=\mathbb{S}_{2}({\mathbf{x}}){\mathbf{V}}_{0}({\mathbf{x}})\mathbb{S}_{2}({\mathbf{x}})^{\top}$ is nonsingular, then by Theorem 2.1 for fixed ${\mathbf{x}}\in{\mathbf{R}}^{p}$ ,

[TABLE]

Note that the sample depth $D_{n}({\mathbf{x}})$ satisfies

[TABLE]

where ${\mathbf{W}}({\mathbf{x}})=\mathbb{S}_{2}({\mathbf{x}})\mathop{\rm Var}\nolimits(\mathbb{S}_{\mathbf{x}}({\mathbf{X}}))\mathbb{S}_{2}({\mathbf{x}})^{\top}$ . Thus the reduction of the asymptotic variance-covariance of the EL-weighted depth $\widetilde{D}_{n}({\mathbf{x}})$ is

[TABLE]

We now use the Delta method to derive the ASN of the EL-weighted spatial median $\tilde{\mathbf{m}}$ . To this end, we need some results from Chaudhuri (1992) in the case of $m=1$ for which the spatial median corresponds to his multivariate Hodges-Lehmann type location estimate. The following is his Assumption 3.1.

(PC)

${\mathbf{X}}_{1}$ , ${\dots}$ , ${\mathbf{X}}_{n}$ are i.i.d random vectors in ${\mathcal{R}}^{d}$ with an absolutely continuous (with respect to the Lebesgue measure) distribution having a density $f$ that is bounded on every bounded subset of ${\mathcal{R}}^{d}$ .

Assume (PC) and $d\geq 2$ . Let ${\mathbf{H}}({\mathbf{x}})=\|{\mathbf{x}}\|^{-1}({\mathbf{I}}_{d}-{\mathbf{x}}{\mathbf{x}}^{\top}/\|{\mathbf{x}}\|^{2})$ if ${\mathbf{x}}\neq 0$ and ${\mathbf{H}}(0)=0$ . Note that $\mathbb{S}({\mathbf{x}})$ and ${\mathbf{H}}({\mathbf{x}})$ are the first and second order partial derivatives of $\|{\mathbf{x}}\|$ . Under (PC), the underlying distribution is absolutely continuous with respect to the Lebesgue measure on ${\mathcal{R}}^{d}$ ( $d\geq 2$ ), hence the (population) spatial median $\mathbf{m}_{0}$ uniquely exists and satisfies the equation $E(\mathbb{S}(\mathbf{m}_{0}-{\mathbf{X}}))=0$ . The sample spatial median $\mathbf{m}_{n}$ satisfies

[TABLE]

Let ${\mathbf{J}}=E\big{(}(\mathbb{S}\mathbb{S}^{\top})({\mathbf{m}}_{0}-{\mathbf{X}})\big{)}$ and ${\mathbf{K}}=E\big{(}{\mathbf{H}}(\mathbf{m}_{0}-{\mathbf{X}})\big{)}$ . Chaudhuri (1992) showed in his Theorem 3.3 and its corollary that if (PC) holds then the matrices ${\mathbf{J}}$ and ${\mathbf{K}}$ are positive definite and $\mathbf{m}_{n}$ satisfies

[TABLE]

Note that the EL-weighted spatial median $\widetilde{\mathbf{m}}_{n}$ satisfies the equation,

[TABLE]

Using the Delta method, we derive, with ${\mathbf{V}}_{0}(\mathbf{m}_{0})={\mathbf{J}}-\mathop{\rm Var}\nolimits({\mathbf{P}}_{\mathbf{m}_{0}}({\mathbf{X}}))$ ,

[TABLE]

where $\mathop{\rm Var}\nolimits({\mathbf{P}}_{\mathbf{m}_{0}}({\mathbf{X}}))$ is calculated by (2.6).

Growing number of estimated constraints. For unknown $F(x)$ , we estimate it by the symmetrized empirical distribution function,

[TABLE]
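A symmetrized empirical distribution function can be computed by pooling the residuals with their negatives. The sketch below assumes the common form ${\mathbb{F}}(t)=(2n)^{-1}\sum_{j=1}^{n}(\mathbf{1}[\varepsilon_{j}\leq t]+\mathbf{1}[-\varepsilon_{j}\leq t])$ ; the function name `symmetrized_edf` is hypothetical.

```python
import bisect

def symmetrized_edf(eps):
    """Return t -> (1 / (2n)) * sum_j (1[eps_j <= t] + 1[-eps_j <= t])."""
    # Pool each residual with its negative, then sort once for fast lookups.
    pooled = sorted(list(eps) + [-e for e in eps])
    size = len(pooled)

    def F_hat(t):
        # Number of pooled values <= t, divided by 2n.
        return bisect.bisect_right(pooled, t) / size

    return F_hat
```

At any `t` that is not one of the values $\pm\varepsilon_{j}$ , the returned function satisfies `F_hat(t) + F_hat(-t) == 1`, so `2 * F_hat(t) - 1` is odd there, mirroring the oddness of $G$ .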

Let ${\mathbb{G}}(x)=2{\mathbb{F}}(x)-1$ . We thus obtain computable functions $s_{j}({\mathbb{G}}(x))$ . Write ${\mathbf{u}}_{n}$ for ${\mathbf{u}}$ , and estimate it by $\hat{\mathbf{u}}_{n}(x)=(s_{1}({\mathbb{G}}(x)),...,s_{m_{n}}({\mathbb{G}}(x)))^{\top}$ . The EL-weighted estimator of $\mbox{\boldmath$ \theta $\unboldmath}({\mathbf{x}})=E(\mathbb{S}_{\mathbf{x}}(X))$ is now given by

[TABLE]

where $\mbox{\boldmath$ \hat{\zeta} $\unboldmath}_{n}$ solves Eqt (2.2) with ${\mathbf{u}}=\hat{\mathbf{u}}_{n}$ . We have

Theorem 2.2.

Suppose that $F$ is continuous. Then $\mbox{\boldmath$ \hat{\theta} $\unboldmath}_{n}$ defined in (2.7) satisfies the conclusions of Theorem 2.1 as $m_{n}\to\infty$ such that $m_{n}^{6}/n\to 0$ .

Proof of Theorem 2.2. We shall use Theorem 6.1 of Wang and Peng (2022) for the proof. First, (C) is satisfied with ${\mathbf{W}}_{n}$ regular as ${\mathbf{W}}_{n}=I_{m_{n}}$ . Next, (6.1) follows from $\|\hat{\mathbf{u}}_{n}(Z_{j})\|^{2}\leq m_{n}$ and $m_{n}^{4}/n=o(1)$ . Let

[TABLE]

Then $\bar{\mathbf{W}}_{n}-{\mathbf{W}}_{n}=o_{p}(m_{n}^{-1})$ follows from $m_{n}^{4}/n=o(1)$ and

[TABLE]

Let $D_{n}=n^{-1}\sum_{j=1}^{n}\|\hat{\mathbf{u}}_{n}({\mathbf{X}}_{j})-{\mathbf{u}}_{n}({\mathbf{X}}_{j})\|^{2}$ . It is easy to see

[TABLE]

Thus (6.2) follows from $D_{n}=o_{p}(m_{n}^{-2})$ to be shown next. To this end, let $\mathbf{s}_{n}=(s_{1},...,s_{m_{n}})^{\top}$ . Then $\|\mathbf{s}_{n}(t)\|\leq m_{n}^{1/2}$ . One verifies $\|\mathbf{s}_{n}^{\prime}(t)\|\leq am_{n}^{3/2}$ for some constant $a$ . Therefore, $D_{n}=o_{p}(m_{n}^{-2})$ follows from $D_{n}=O_{p}(m^{3}_{n}/n)$ and $m_{n}^{5}/n=o(1)$ , in view of

[TABLE]

Denoting $\mbox{\boldmath$ \psi $\unboldmath}({\mathbf{y}})=\mathbb{S}_{\mathbf{x}}({\mathbf{y}})$ , we break

[TABLE]

By Cauchy inequality,

[TABLE]

as $m_{n}^{4}/n=o(1)$ . We now bound the variance by the second moment to get

[TABLE]

as $m_{n}^{2}/n=o(1)$ . Taken together we prove (6.3) – (6.4).

We now show that (6.5) holds with ${\mathbf{v}}_{n}={\mathbf{u}}_{n}$ . To this end, using Taylor expansion we write

[TABLE]

where $G_{nj}^{*}$ lies in between ${\mathbb{G}}(\varepsilon_{j})$ and $G(\varepsilon_{j})$ . It thus follows

[TABLE]

as $m_{n}^{4}/n=o(1)$ . This shows ${\mathbf{L}}_{n}=o_{p}((m_{n}n)^{-1/2})$ . One has $\|\mathbf{s}_{n}^{\prime\prime}(t)\|=O(m_{n}^{5/2})$ . Using this, we get

[TABLE]

as $m_{n}^{6}/n=o(1)$ . This yields $\mathbf{M}_{n}=o_{p}((m_{n}n)^{-1/2})$ . Taken together the desired (6.5) follows. We now apply Theorem 6.1 to finish the proof. $\hfill\square$

3 Efficient estimation of linear functionals with known marginals

Suppose that there is available the information that the marginal distributions $F$ and $G$ of $Q$ are known. This can be characterized by

[TABLE]

Bickel, et al. (1991) and Peng and Schick (2002) constructed efficient estimators of the linear functional $\theta=\int\psi\,dQ$ , and proved the ASN under the assumption,

(K)

There exists $\rho>0$ such that for arbitrary measurable sets $A$ and $B$ ,

[TABLE]

Bickel, et al. (1991) showed that the projection of $\psi\in L_{2}(Q)$ onto the sum space $L_{2,0}(F)+L_{2,0}(G)$ uniquely exists. They demonstrated that the asymptotic variance of the efficient estimator $\tilde{\theta}$ of $\theta$ can be substantially less than that of the empirical estimator $n^{-1}\sum_{j=1}^{n}\psi(X_{j},Y_{j})$ . For example, they showed that the empirical DF $n^{-1}\sum_{j=1}^{n}\mathbf{1}[X_{j}\leq 1/2,Y_{j}\leq 1/2]$ of $\theta=P(X\leq 1/2,Y\leq 1/2)$ (taking $\psi_{s,t}(x,y)=\mathbf{1}[x\leq s,y\leq t]$ ) has three times the asymptotic variance of the efficient estimator $\tilde{\theta}$ of $\theta$ in the case that $F$ and $G$ are uniform distributions over $[0,1]$ and $X,Y$ are independent.
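The factor of three in this example can be checked by a short calculation. The sketch below uses the fact that, when $X$ and $Y$ are independent, $L_{2,0}(F)$ and $L_{2,0}(G)$ are orthogonal, so the projection of $\psi$ onto the sum space is the sum of the two centered conditional expectations; the symbol $\varphi_{0}$ for this projection is used here for illustration. With $\psi(x,y)=\mathbf{1}[x\leq 1/2,y\leq 1/2]$ and $\theta=1/4$ ,

$$\mathop{\rm Var}\nolimits(\psi)=\theta(1-\theta)=\tfrac{3}{16},\qquad \varphi_{0}(x,y)=\tfrac{1}{2}\big(\mathbf{1}[x\leq\tfrac{1}{2}]-\tfrac{1}{2}\big)+\tfrac{1}{2}\big(\mathbf{1}[y\leq\tfrac{1}{2}]-\tfrac{1}{2}\big),$$

$$\mathop{\rm Var}\nolimits(\varphi_{0})=2\cdot\tfrac{1}{4}\cdot\tfrac{1}{4}=\tfrac{1}{8},\qquad \mathop{\rm Var}\nolimits(\psi)-\mathop{\rm Var}\nolimits(\varphi_{0})=\tfrac{3}{16}-\tfrac{2}{16}=\tfrac{1}{16},$$

so the ratio of the two asymptotic variances is $(3/16)/(1/16)=3$ .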

Here we propose an efficient estimator based on maximum empirical likelihood. Employing a basis $\left\{c_{k}\right\}$ of $L_{2,0}(F)$ and $\left\{d_{k}\right\}$ of $L_{2,0}(G)$ , we can reduce the uncountably many characterizing equations to countably many ones,

[TABLE]

Suppose that $F$ and $G$ are continuous. This allows us to take $c_{k}=b_{k}(F)$ and $d_{k}=b_{k}(G)$ , where $b_{k}(t)$ is the trigonometric basis,

[TABLE]
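The following sketch shows how constraint functions of the form $b_{k}(F(x))$ and $b_{k}(G(y))$ could be evaluated and stacked. It assumes the cosine basis $b_{k}(t)=\sqrt{2}\cos(k\pi t)$ on $[0,1]$ , a common orthonormal choice for mean-zero functions under the uniform distribution; the basis formula, the function names and the numerical check `inner` are assumptions made for illustration.

```python
import math

def b(k, t):
    """Assumed trigonometric basis: sqrt(2) * cos(k * pi * t) on [0, 1]."""
    return math.sqrt(2.0) * math.cos(k * math.pi * t)

def known_marginal_constraints(x, y, F, G, m):
    """Stack (b_1(F(x)), ..., b_m(F(x)), b_1(G(y)), ..., b_m(G(y)))."""
    bx = [b(k, F(x)) for k in range(1, m + 1)]
    by = [b(k, G(y)) for k in range(1, m + 1)]
    return bx + by

def inner(f, g, grid=2000):
    """Midpoint-rule approximation of the integral of f * g over [0, 1]."""
    h = 1.0 / grid
    return sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(grid)) * h
```

Under this choice each coordinate is bounded by $\sqrt{2}$ in absolute value, so a stacked vector of length $2m$ has Euclidean norm at most $2\sqrt{m}$ .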

That is, $\left\{c_{k}\right\}$ and $\left\{d_{k}\right\}$ are bases of $L_{2,0}(F)$ and $L_{2,0}(G)$ , respectively. Using the first $2m_{n}$ terms as constraints, the EL-weighted estimator of $\theta$ is

[TABLE]

where ${\mathbf{u}}_{n}(x,y)=({\mathbf{b}}_{n}(F(x))^{\top},{\mathbf{b}}_{n}(G(y))^{\top})^{\top}$ with ${\mathbf{b}}_{n}=(b_{1},...,b_{m_{n}})^{\top}$ . Using Theorem 1.1, we prove

Theorem 3.1.

Suppose that $F$ and $G$ are continuous. Assume (K). Then, as $m_{n}\to\infty$ such that $m_{n}^{4}/n\to 0$ ,

[TABLE]

where $\varphi_{0}$ is the projection of $\psi$ onto the sum space $L_{2,0}(F)+L_{2,0}(G)$ . Hence,

[TABLE]

where $\Sigma=\mathop{\rm Var}\nolimits(\psi({\mathbf{Z}}))-\mathop{\rm Var}\nolimits(\varphi_{0}({\mathbf{Z}}))$ .

Remark 3.1.

By Bickel, et al. (1991) (pp. 1328–29), the estimator $\tilde{\theta}_{n}$ in (3.3) of $\theta=\int\psi(x,y)\,dQ(x,y)$ is semiparametrically efficient.

Proof of Theorem 3.1. We shall rely on Theorem 1.1. Since $\|{\mathbf{u}}_{n}\|\leq 2\sqrt{m_{n}}$ and $m_{n}^{4}/n=o(1)$ , it follows that (1.6) holds. Thus

[TABLE]

as $m_{n}^{4}/n=o(1)$ . This shows (1.7). Let

[TABLE]

It follows from $m_{n}^{2}/n=o(1)$ that (1.8) holds in view of

[TABLE]

We are now left to prove the regularity of ${\mathbf{W}}_{n}$ . Since ${\mathbf{b}}_{n}$ are the first $m_{n}$ terms of the orthonormal basis $\left\{b_{k}\right\}$ , it follows that $E({\mathbf{b}}_{n}(F(X)){\mathbf{b}}_{n}(F(X))^{\top})={\mathbf{I}}_{m_{n}}$ . The same holds for $G$ . Let ${\mathbf{C}}_{n}=E({\mathbf{b}}_{n}(F(X)){\mathbf{b}}_{n}(G(Y))^{\top})$ . Then ${\mathbf{W}}_{n}$ is the $2m_{n}\times 2m_{n}$ dispersion matrix whose (1,1)- and (2, 2)-blocks are equal to ${\mathbb{I}}_{m_{n}}$ and the (1,2)-block equal to ${\mathbf{C}}_{n}$ . For ${\mathbf{s}},{\mathbf{t}}\in{\mathcal{R}}^{m_{n}}$ with $\|{\mathbf{s}}\|^{2}+\|{\mathbf{t}}\|^{2}=1$ , set ${\mathbf{r}}=({\mathbf{s}}^{\top},{\mathbf{t}}^{\top})^{\top}$ . We have

[TABLE]

By Cauchy inequality,

[TABLE]

It thus follows from (3.5) that ${\mathbf{r}}^{\top}{\mathbf{W}}_{n}{\mathbf{r}}\leq 4$ uniformly in $n$ and the above ${\mathbf{r}}$ . For $a\in L_{2,0}(F)$ and $b\in L_{2,0}(G)$ , (K) implies

[TABLE]

Thus

[TABLE]

Replacing $a$ with $-a$ yields

[TABLE]

Taking $a={\mathbf{s}}^{\top}{\mathbf{b}}_{n}(F)$ and $b={\mathbf{b}}_{n}(G)^{\top}{\mathbf{t}}$ and noticing

[TABLE]

we derive

[TABLE]

By (3.5), we thus arrive at

[TABLE]

Taken together we prove the regularity of ${\mathbf{W}}_{n}$ , and apply Theorem 1.1 to complete the proof. $\hfill\square$

4 Efficient estimation of linear functionals with equal marginals

Suppose that the marginal distributions $F$ and $G$ of $X$ and $Y$ are equal but unknown. This is equivalent to the assertion that

[TABLE]

where $\left\{a_{k}\right\}$ is an orthonormal basis of $L_{2,0}(H)$ with $H=(F+G)/2$ . Assume that $F$ and $G$ are continuous. This allows us to take $a_{k}(x)=b_{k}(H(x))$ under the assumption $F=G=H$ , where $\left\{b_{k}\right\}$ is the trigonometric basis in (3.2). As $H$ is unknown, we estimate it by the pooled empirical distribution function,

[TABLE]

This gives us computable functions $b_{k}({\mathbb{H}}(x))$ . Let ${\mathbf{u}}_{n}(x,y)={\mathbf{b}}_{n}(H(x))-{\mathbf{b}}_{n}(H(y)),x,y\in{\mathcal{R}}$ . This is unknown and can be estimated by $\hat{\mathbf{u}}_{n}(x,y)={\mathbf{b}}_{n}({\mathbb{H}}(x))-{\mathbf{b}}_{n}({\mathbb{H}}(y))$ . Using the first $m_{n}$ terms as constraints, the EL-weighted estimator of $\theta=E(\psi(X,Y))$ is given by

[TABLE]

where $\mbox{\boldmath$ \hat{\zeta} $\unboldmath}_{n}$ is the solution to Eqt (1.2) with ${\mathbf{u}}=\hat{\mathbf{u}}_{n}$ .
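The estimated constraints of this section can be sketched as follows. The code assumes that the pooled empirical distribution function is ${\mathbb{H}}(t)=(2n)^{-1}\sum_{j=1}^{n}(\mathbf{1}[X_{j}\leq t]+\mathbf{1}[Y_{j}\leq t])$ and that $b_{k}(t)=\sqrt{2}\cos(k\pi t)$ ; both formulas and the function names are assumptions made for illustration.

```python
import bisect
import math

def pooled_edf(xs, ys):
    """Return t -> (1 / (2n)) * (#{X_j <= t} + #{Y_j <= t})."""
    pooled = sorted(list(xs) + list(ys))
    size = len(pooled)
    return lambda t: bisect.bisect_right(pooled, t) / size

def estimated_constraints(x, y, H_hat, m):
    """Vector with k-th entry b_k(H_hat(x)) - b_k(H_hat(y)), k = 1, ..., m."""
    hx, hy = H_hat(x), H_hat(y)
    return [math.sqrt(2.0) * (math.cos(k * math.pi * hx)
                              - math.cos(k * math.pi * hy))
            for k in range(1, m + 1)]
```

By construction the vector changes sign when `x` and `y` are swapped and vanishes when they coincide, which reflects the equal-marginals restriction.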

Peng and Schick (2005) constructed efficient estimators of linear functionals of a bivariate distribution with equal marginals under the condition,

[TABLE]

where ${\mathbb{A}}=\{a\in L_{2,0}(H):\int a^{2}\,dH=1\}$ is the unit sphere in $L_{2,0}(H)$ . They exhibited that the asymptotic variance of an efficient estimator of $\theta$ is about 1/3 of that of the empirical estimator or smaller.

Applying Theorem 6.1, we show that $\hat{\theta}_{n}$ is efficient.

Theorem 4.1.

Suppose that the distribution functions $F$ and $G$ are equal and continuous. Assume (4.3). Then, as $m_{n}\to\infty$ such that $m_{n}^{6}/n\to 0$ , $\hat{\theta}_{n}$ given in (4.2) satisfies

[TABLE]

where $\varphi$ is the projection of $\psi$ onto ${\mathbb{A}}$ . Thus

[TABLE]

where $\Sigma=\mathop{\rm Var}\nolimits(\psi)-\mathop{\rm Var}\nolimits(\varphi)$ .

Remark 4.1.

By Theorem 3 of Peng and Schick (2005), the estimator $\hat{\theta}_{n}$ given in (4.2) of $\theta=\int\psi(x,y)\,dQ(x,y)$ is semiparametrically efficient.

Proof of Theorem 4.1. We shall apply Theorem 6.1. Recalling the trigonometric basis $\left\{b_{k}\right\}$ in (3.2), one readily verifies that ${\mathbf{b}}_{n}=(b_{1},\ldots,b_{m_{n}})^{\top}$ has the properties,

[TABLE]

where ${\mathbf{b}}_{n}^{\prime}$ and ${\mathbf{b}}_{n}^{\prime\prime}$ denote the first and second order derivatives of ${\mathbf{b}}_{n}$ .

Recalling ${\mathbf{u}}_{n}(x,y)={\mathbf{b}}_{n}(H(x))-{\mathbf{b}}_{n}(H(y))$ and $\hat{\mathbf{u}}_{n}(x,y)={\mathbf{b}}_{n}({\mathbb{H}}(x))-{\mathbf{b}}_{n}({\mathbb{H}}(y))$ , one gets by the first inequality in (4.4) that

[TABLE]

Hence (6.1) holds as $m_{n}^{4}/n=o(1)$ . Noting ${\mathbf{W}}_{n}=E({\mathbf{u}}_{n}({\mathbf{Z}}){\mathbf{u}}_{n}({\mathbf{Z}})^{\top})$ , one has by (4.3) that

[TABLE]

uniformly in $n$ and $\|\mbox{\boldmath$ \lambda $\unboldmath}\|=1$ as both $\mbox{\boldmath$ \lambda $\unboldmath}^{\top}{\mathbf{b}}_{n}(H(X))$ and $\mbox{\boldmath$ \lambda $\unboldmath}^{\top}{\mathbf{b}}_{n}(H(Y))$ live in ${\mathbb{A}}$ . Moreover,

[TABLE]

Thus ${\mathbf{W}}_{n}$ is regular. Let

[TABLE]

Then by the first equality in (4.5),

[TABLE]

Hence $\bar{\mathbf{W}}_{n}-{\mathbf{W}}_{n}=o_{p}(m_{n}^{-1})$ as $m_{n}^{4}/n=o(1)$ . It can be seen

[TABLE]

where $D_{n}=n^{-1}\sum_{j=1}^{n}\|\hat{\mathbf{u}}_{n}({\mathbf{Z}}_{j})-{\mathbf{u}}_{n}({\mathbf{Z}}_{j})\|^{2}$ . Thus (6.2) is implied by

[TABLE]

Using the second inequality in (4.4), we derive

[TABLE]

Hence $D_{n}=O_{p}(m^{3}_{n}/n)$ and (4.6) holds as $m_{n}^{5}/n=o(1)$ . We break

[TABLE]

where

[TABLE]

By Cauchy inequality,

[TABLE]

where the last equality holds as $m_{n}^{4}/n=o(1)$ . We now bound the variance by the second moment and by the first equality in (4.5) to get

[TABLE]

as $m_{n}^{2}/n=o(1)$ . Taken together (6.3) follows. We now show (6.5) holds with ${\mathbf{v}}_{n}={\mathbf{u}}_{n}$ . Using Taylor’s expansion, we write

[TABLE]

where

[TABLE]

where $H_{nj}^{*}$ lies in between ${\mathbb{H}}(X_{j})$ and $H(X_{j})$ . Using the second inequality in (4.4), we get

[TABLE]

as $m_{n}^{4}/n=o(1)$ . This shows ${\mathbf{L}}_{n}=o_{p}((m_{n}n)^{-1/2})$ . Using the third inequality in (4.4), one has as $m_{n}^{6}/n=o(1)$ that

[TABLE]

This yields $\mathbf{M}_{n}=o_{p}((m_{n}n)^{-1/2})$ . Taken together one proves (6.5). This and (4.6) imply (6.4) as $m_{n}^{4}/n=o(1)$ . Clearly, ${\mathbf{U}}_{n}={\mathbf{I}}_{m_{n}}$ satisfies $|{\mathbf{U}}_{n}|_{o}=1=O(1)$ . Peng and Schick (2005) showed that the projection of any $h\in L_{2}(Q)$ onto ${\mathbb{A}}$ uniquely exists under the assumption (4.3). Moreover, it is clear that $b_{k}(H(x))-b_{k}(H(y)),k=1,2,\dots$ is a basis of ${\mathbb{A}}$ , so that $[{\mathbf{u}}_{\infty}]={\mathbb{A}}$ . We now apply Theorem 6.1 to complete the proof. $\hfill\square$

5 Simulations

We ran a simulation study to compare the efficiency of the EL-weighted spatial median $\widetilde{\mathbf{m}}_{n}$ with the sample spatial median $\mathbf{m}_{n}$ in the presence of a variety of side information. Reported in Tables 1–5 are the maximum eigenvalues of the asymptotic variance-covariance matrices and their ratios. Random samples were generated from 2- and 3-dimensional Cauchy distributions, Student $t(3)$ with 3 degrees of freedom (df), the copula distributions (see the details in the Appendix) and the asymmetric Laplace for sample sizes $n=50,100,200,500$ . Based on $M=2000$ repetitions, we calculated the averages of the maximum eigenvalues $\lambda$ and $\tilde{\lambda}$ (i.e. the spectral norms) of the asymptotic variance-covariance matrices of $\mathbf{m}_{n}$ and $\widetilde{\mathbf{m}}_{n}$ , and the ratio $\tilde{\lambda}/\lambda$ . A ratio less than one indicates a reduction in the norm of the variance-covariance matrix of the EL-weighted spatial median from that of the sample spatial median.
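The comparison criterion can be illustrated by a small sketch for the bivariate case. It assumes, as a simplification, that the variance-covariance matrix of each estimator is approximated by the Monte Carlo sample covariance of its replicated values; the paper's exact computation may differ, and the function names are hypothetical.

```python
import math

def max_eigenvalue_2x2(estimates):
    """Largest eigenvalue of the sample covariance of bivariate estimates."""
    reps = len(estimates)
    mx = sum(e[0] for e in estimates) / reps
    my = sum(e[1] for e in estimates) / reps
    a = sum((e[0] - mx) ** 2 for e in estimates) / (reps - 1)
    c = sum((e[1] - my) ** 2 for e in estimates) / (reps - 1)
    b = sum((e[0] - mx) * (e[1] - my) for e in estimates) / (reps - 1)
    # Closed form for the top eigenvalue of the symmetric matrix [[a, b], [b, c]].
    return 0.5 * (a + c) + math.sqrt((0.5 * (a - c)) ** 2 + b * b)

def eigenvalue_ratio(el_estimates, sample_estimates):
    """Ratio lambda_tilde / lambda; values below one favor the EL-weighted estimator."""
    return max_eigenvalue_2x2(el_estimates) / max_eigenvalue_2x2(sample_estimates)
```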

For Table 1, the side information is that the componentwise medians are known. For Tables 2–5, the information is that one marginal is symmetric about the origin ( $m=1,3,5$ constraints considered), for which we looked at both known and unknown marginal (estimated by the symmetrized EDF).

Observe that for the case of known componentwise medians, the efficiency gain of the EL-weighted spatial median over the sample spatial median exceeded 80%; for the case of known or estimated symmetric marginal, the efficiency gain is more than 30%. All the ratios considered are substantially smaller than one, indicating substantial efficiency gains of the EL-weighted spatial depth over the sample depth. The simulation results indicated that the componentwise median is less efficient than the spatial median but not that much for the case considered.

6 Declaration of interest statement

The authors report there are no competing interests to declare.

Appendix

The details of the copula distributions. The 2-dimensional copula distribution has $N(0,1)$ and $t(3)$ marginals with correlation coefficient $0.5$ . The 3-dimensional copula has two $N(0,1)$ marginals with correlation coefficient $0.5$ and one $t(3)$ marginal which is correlated with each of $N(0,1)$ with correlation $0.1$ . The copula has the joint cumulative distribution function with the uniform marginals, where each uniform marginal is defined by applying the probability integral transform on the cumulative distribution functions of two $N(0,1)$ and one $t(3)$ , respectively.

We cite Theorem 4 of Wang and Peng (2022) below for convenience.

Theorem 6.1.

Suppose ${\mathbf{u}}_{n}=(u_{1},\dots,u_{m_{n}})^{\top}$ satisfies (C) for each $m=m_{n}$ . Let $\hat{\mathbf{u}}_{n}$ be an estimator of ${\mathbf{u}}_{n}$ such that

[TABLE]

for which the $m_{n}\times m_{n}$ dispersion matrix ${\mathbf{W}}_{n}$ is regular,

[TABLE]

there exists some measurable function ${\mathbf{v}}_{n}$ from ${\mathcal{Z}}$ into ${\mathcal{R}}^{m_{n}}$ such that (C) is met for every $m=m_{n}$ , the dispersion matrix ${\mathbf{U}}_{n}={\mathbf{W}}_{n}^{-1/2}\int{\mathbf{v}}_{n}{\mathbf{v}}_{n}^{\top}\,dQ{\mathbf{W}}_{n}^{-\top/2}$ satisfies ${\mathbf{U}}_{n}=O(1)$ ,

[TABLE]

Then $\mbox{\boldmath$ \hat{\theta} $\unboldmath}$ satisfies, as $m_{n}$ tends to infinity, the stochastic expansion,

[TABLE]

where $\mbox{\boldmath$ \varphi $\unboldmath}=\Pi(\mbox{\boldmath$ \psi $\unboldmath}|[{\mathbf{v}}_{\infty}])$ is the projection of $\psi$ onto the closed linear span $[{\mathbf{v}}_{\infty}]$ . Thus

[TABLE]

where $\varSigma=\mathop{\rm Var}\nolimits(\mbox{\boldmath$ \psi $\unboldmath}(Z))-\mathop{\rm Var}\nolimits(\mbox{\boldmath$ \varphi $\unboldmath}(Z))$ .

Bibliography

[1] Bickel, P.J., Ritov, Y. and Wellner, J.A. (1991). Efficient estimation of linear functionals of a probability measure $P$ with known marginal distributions. Ann. Statist. 19: 1316–1346.
[2] Bickel, P.J., Klaassen, C.A.J., Ritov, Y. and Wellner, J.A. (1993). Efficient and Adaptive Estimation in Semiparametric Models. Johns Hopkins Univ. Press, Baltimore.
[3] Chaudhuri, P. (1992). Multivariate location estimation using extension of R-estimates through U-statistics type approach. Ann. Statist. 20: 897–916.
[4] Chen, Y., Dang, X., Peng, H. and Bart, Jr., H.L. (2009). Outlier Detection with the Kernelized Spatial Depth Function. IEEE Transactions on Pattern Analysis and Machine Intelligence 31: 288–305.
[5] Hjort, N.L., McKeague, I.W. and Van Keilegom, I. (2009). Extending the scope of empirical likelihood. Ann. Statist. 37: 1079–1111.
[6] Owen, A. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75: 237–249.
[7] Owen, A. (2001). Empirical Likelihood. Chapman & Hall/CRC, London.
[8] Parente, P.M.D.C. and Smith, R.J. (2011). GEL methods for nonsmooth moment indicators. Econometric Theory 27: 74–113.
