On shrinkage estimation for balanced loss functions

Éric Marchand; William E. Strawderman

arXiv:1904.03171·math.ST·April 8, 2019·J. Multivar. Anal.


TL;DR

This paper develops shrinkage estimators for multivariate means under modified balanced loss functions, demonstrating dominance over standard estimators for certain distributions and loss functions, with implications for robustness.

Contribution

It introduces Baranchik-type estimators that dominate the benchmark $\delta_{0}(X)=X$ under two modified balanced loss functions built from increasing, concave $\rho$ and $\ell$ that satisfy a completely monotone property.

Findings

01

Proposed estimators dominate the benchmark in normal and scale mixture models.

02

Results extend dominance to a class of concave, completely monotone loss functions.

03

Implications for robustness and simultaneous dominance in multivariate estimation.

Abstract

The estimation of a multivariate mean $θ$ is considered under natural modifications of balanced loss function of the form: (i) $ω ρ (∥ δ - δ_{0} ∥^{2}) + (1 - ω) ρ (∥ δ - θ ∥^{2})$, and (ii) $ℓ (ω ∥ δ - δ_{0} ∥^{2} + (1 - ω) ∥ δ - θ ∥^{2})$, where $δ_{0}$ is a target estimator of $γ (θ)$. After briefly reviewing known results for original balanced loss with identity $ρ$ or $ℓ$, we provide, for increasing and concave $ρ$ and $ℓ$ which also satisfy a completely monotone property, Baranchik-type estimators of $θ$ which dominate the benchmark $δ_{0} (X) = X$ for $X$ either distributed as multivariate normal or as a scale mixture of normals. Implications are given with respect to model robustness and simultaneous dominance with respect to either $ρ$ or $ℓ$.

Equations

$L_{\omega}(\theta,\delta)=\omega\,\|\delta-\delta_{0}\|^{2}+(1-\omega)\,\|\delta-\gamma(\theta)\|^{2}$ (1.1)

$\omega\,\rho(\|\delta-\delta_{0}\|^{2})+(1-\omega)\,\rho(\|\delta-\gamma(\theta)\|^{2})$ (1.2)

$\ell\left(\omega\,\|\delta-\delta_{0}\|^{2}+(1-\omega)\,\|\delta-\gamma(\theta)\|^{2}\right)$ (1.3)

$X|V\sim N_{d}(\theta,VI_{d}),\ V\sim g$ (2.4)

$\delta_{a,r(\cdot)}(X)=\left(1-\frac{a\,r(\|X\|^{2})}{\|X\|^{2}}\right)X$ (2.5)

$0\leq r(\cdot)\leq 1,\ r(\cdot)\not\equiv 0,\ r^{\prime}(\cdot)\geq 0,\ \hbox{and}\ \frac{d}{dt}\left(\frac{r(t)}{t}\right)\leq 0$ (2.6)

$\Delta_{\omega}(\theta,g)=L_{\omega}(\theta,\delta_{0}+(1-\omega)g)-L_{\omega}(\theta,\delta_{0})$ (2.7)

$L_{\omega,\rho}(\theta,\delta)=\omega\,\rho(\|\delta-X\|^{2})+(1-\omega)\,\rho(\|\delta-\theta\|^{2}),\ 0\leq\omega<1$ (3.8)

$Y|W\sim N_{d}(\theta,WI_{d}),\ \hbox{with}\ W\sim h$ (3.9)

$f(t)=K_{1}\int_{0}^{\infty}e^{-t/(2v)}\,dG(v),\quad\rho^{\prime}(t)=K_{2}\int_{0}^{\infty}e^{-t/(2\tau)}\,dH(\tau)$

$\mathbb{E}_{\theta}\left(\rho\left(\frac{a^{2}}{X^{\prime}X}\right)\right)\leq\rho^{\prime}(0)\,\mathbb{E}_{\theta}\left(\frac{a^{2}}{X^{\prime}X}\right)\leq\rho^{\prime}(0)\,\mathbb{E}_{\theta}\left(\frac{a^{2}}{Y^{\prime}Y}\right)$

$\mathbb{E}_{\theta}\left(\rho\left(\frac{a^{2}\,r^{2}(\|X\|^{2})}{\|X\|^{2}}\right)\right)\leq\rho^{\prime}(0)\,\mathbb{E}_{\theta}\left(\frac{a^{2}\,r(\|Y\|^{2})}{\|Y\|^{2}}\right)$

$\rho(t)\leq\rho(0)+\rho^{\prime}(0)\,t=\rho^{\prime}(0)\,t$

$g_{0}(t)\ \hbox{is superharmonic for}\ d\geq 4$

$\Delta\,h(\|t\|^{2})=2d\,h^{\prime}(\|t\|^{2})+4\,(t^{\prime}t)\,h^{\prime\prime}(\|t\|^{2})$

$0<a<\frac{2(d-2)\,K\,(1-\omega)\,\{\mathbb{E}(W^{-1})\}^{-1}}{\omega\,\rho^{\prime}(0)+(1-\omega)\,K}$ (3.12)

$0<a<\frac{2\,K^{2}\,(1-\omega)\,\left\{\mathbb{E}_{0}\left(\frac{\rho^{\prime}(\|X\|^{2})}{\|X\|^{2}}\right)\right\}^{-1}}{\omega\,\rho^{\prime}(0)+(1-\omega)\,K}$ (3.13)


Full text

On shrinkage estimation for balanced loss functions

Éric Marchand (a) & William E. Strawderman (b)

(a) Université de Sherbrooke, Département de mathématiques, Sherbrooke Qc, CANADA, J1K 2R1 (e-mail: [email protected])

(b) Rutgers University, Department of Statistics, 501 Hill Center, Busch Campus, Piscataway, N.J., USA, 08855 (e-mail: [email protected])

Summary

The estimation of a multivariate mean $\theta$ is considered under natural modifications of balanced loss function of the form: (i) $\omega\,\rho(\|\delta-\delta_{0}\|^{2})+(1-\omega)\,\rho(\|\delta-\theta\|^{2})$ , and (ii) $\ell\left(\omega\,\|\delta-\delta_{0}\|^{2}+(1-\omega)\,\|\delta-\theta\|^{2}\right)\,$ , where $\delta_{0}$ is a target estimator of $\gamma(\theta)$ . After briefly reviewing known results for original balanced loss with identity $\rho$ or $\ell$ , we provide, for increasing and concave $\rho$ and $\ell$ which also satisfy a completely monotone property, Baranchik-type estimators of $\theta$ which dominate the benchmark $\delta_{0}(X)=X$ for $X$ either distributed as multivariate normal or as a scale mixture of normals. Implications are given with respect to model robustness and simultaneous dominance with respect to either $\rho$ or $\ell$ .

AMS 2010 subject classifications: 62F10, 62J07 (primary); 62C15, 62C20 (secondary).

Keywords and phrases: Balanced loss; Concave loss; Dominance; Multivariate normal; Scale mixture of normals; Shrinkage estimation.

1 Introduction

Balanced loss functions and their role in estimation have captured the interest of many researchers since Arnold Zellner (Zellner, 1994) proposed their use in a regression framework. Balanced loss functions are appealing as they combine proximity of a given estimator $\delta$ to both a target estimator $\delta_{0}$ and the unknown parameter $\theta$ which is being estimated. They relate conceptually to methods for combining estimators (e.g., Judge & Mittelhammer, 2004), as well as to penalized least-squares estimation. The study of balanced loss functions has frequently been cast in a regression framework (e.g., Hu & Peng, 2011, and the references therein), but it has also arisen in, or been related to, credibility theory, finance, sequential estimation, etc. (Baran & Stepień-Baran, 2013; Zhang & Chen, 2018). In Zellner’s framework, the target estimator was least-squares, but such a target can be viewed more broadly (e.g., Jafari Jozani et al., 2006, 2014).

To a large extent, findings in the literature relate to balanced squared error loss

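$L_{\omega}(\theta,\delta)=\omega\,\|\delta-\delta_{0}\|^{2}+(1-\omega)\,\|\delta-\gamma(\theta)\|^{2},$ (1.1)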

where for an observable $X\sim f_{\theta}$, $\gamma(\theta)\in\Gamma\subset\mathbb{R}^{d}$, $\delta_{0}(X)$ is a target estimator of $\gamma(\theta)$, $\omega\in[0,1)$ is the weight given to the proximity of $\delta$ to $\delta_{0}$, and $\delta(X)$ is a given estimator of $\gamma(\theta)$. In such cases, as presented by Jafari Jozani et al. (2006), as well as Dey et al. (1999), Bayesian estimation as well as the frequentist risk performance under balanced loss $L_{\omega}$ with $\omega>0$ relate precisely to corresponding features under unbalanced loss (i.e., squared error loss) $L_{0}$ (see Theorem 2.2). For instance, given a prior $\pi$ and a corresponding Bayes estimator $\delta_{\pi,0}(X)$ under loss $L_{0}$, the corresponding Bayes estimator under balanced loss $L_{\omega}$ is simply given by $(1-\omega)\,\delta_{\pi,0}(X)+\omega\,\delta_{0}(X)$. Such relationships are reviewed and briefly illustrated in Section 2.

In contrast, much less is known for the following two natural alternatives or modifications to loss (1.1):

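$\omega\,\rho(\|\delta-\delta_{0}\|^{2})+(1-\omega)\,\rho(\|\delta-\gamma(\theta)\|^{2})$ (1.2)

and

$\ell\left(\omega\,\|\delta-\delta_{0}\|^{2}+(1-\omega)\,\|\delta-\gamma(\theta)\|^{2}\right),$ (1.3)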

with $0\leq\omega<1$, $\rho(\cdot)\geq 0$, and $\ell(\cdot)\geq 0$. Balanced loss functions of the type (1.2) were considered by Jafari Jozani et al. (2012). They provided Bayesian estimators as well as other types of posterior risk analysis. However, for both losses (1.2) and (1.3), there seem to be no significant known findings for frequentist risk analysis comparable to the earlier results for balanced squared error loss.

The objective of this paper is to try to fill such gaps. To achieve this, we focus on the multivariate normal case $X\sim N_{d}(\theta,\sigma^{2}I_{d})$ , as well as scale mixture of normals as defined in (2.4), the target estimator $\delta_{0}(X)=X$ ; and the objective of improving on $X$ . The latter is the maximum likelihood estimator and also is minimax for losses (1.2) as elaborated upon at the outset of Section 3C. We obtain various sufficient conditions for dominance for both losses (1.2) and (1.3). These apply for interesting subclasses of concave $\rho$ ’s and $\ell$ ’s respectively, which are also completely monotone. Shrinkage estimation for multivariate normal models, and more generally spherically symmetric and elliptically symmetric models, has had a long, rich and influential history (e.g., Fourdrinier et al., 2018). The use of a concave loss as well as concave versions of (1.2) and (1.3), is quite appealing, and has motivated previous shrinkage estimation work such as Brandwein & Strawderman (1980, 1991), Brandwein et al. (1993), and Kubokawa et al. (2015), among others.

The paper is organized as follows. We collect some preliminary definitions and results in Section 2.1, before reviewing and illustrating frequentist risk and Bayesian analysis results in Section 2.2 applicable to balanced squared-error loss $L_{\omega}$ . In Sections 3 and 4, we provide conditions for a Baranchik-type estimator to dominate $\delta_{0}(X)=X$ under loss functions (1.2) and (1.3) respectively (i.e., Theorems 3.3 and 4.4). In both cases, the proofs are unified with respect to choice of model and loss, the former with respect to the underlying normal mixture and the latter with respect to the choice of $\rho$ or $\ell$ for the balanced loss. Implications are given in terms of robustness and simultaneous dominance (i.e., Corollary 4.3). Finally, we make use of various techniques and properties relative to concave functions, completely monotone functions, superharmonic functions, and spherically symmetric distributions.

2 Preliminary results and the balanced squared-error loss case

2.1 Preliminary definitions and properties

We assemble here some definitions and properties useful throughout the manuscript. The estimators studied below are based on spherically symmetric distributions $X\sim f(\|x-\theta\|^{2}),x,\theta\in\mathbb{R}^{d}$ , which are scale mixtures of normals. Such distributions admit the representation

$X|V\sim N_{d}(\theta,VI_{d}),\ V\sim g,$ (2.4)

and include many familiar examples such as Normal, Student, Logistic, Laplace, Exponential power (with $f(t)\propto e^{-t^{s}},0<s<1$), among others. Other than moment finiteness conditions and the restriction to $d\geq 3$ or $d\geq 4$ dimensions, the applicability of our dominance findings will not require any further specific assumptions on $f$.

A key characterization and property, which brings into play completely monotone functions, is given by the following result (see, e.g., Feller, 1966; Berger, 1975; etc.).

Lemma 2.1.

(a) A density of the form $f(\|x-\theta\|^{2}),x,\theta\in\mathbb{R}^{d}$ is a scale mixture of normals if and only if $f(\cdot)$ is completely monotone, i.e., $(-1)^{n}f^{(n)}(t)\geq 0$, for $n=1,2,\ldots$, and $t\in\mathbb{R}_{+}$.

(b) The product of two completely monotone functions is completely monotone.

The dominance findings of Sections 3 and 4 relate to Baranchik-type estimators of $\theta$ defined and denoted throughout as:

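$\delta_{a,r(\cdot)}(X)=\left(1-\frac{a\,r(\|X\|^{2})}{\|X\|^{2}}\right)X,$ (2.5)

with $a>0$, and the conditions

$0\leq r(\cdot)\leq 1,\ r(\cdot)\not\equiv 0,\ r^{\prime}(\cdot)\geq 0,\ \hbox{and}\ \frac{d}{dt}\left(\frac{r(t)}{t}\right)\leq 0.$ (2.6)

These include James-Stein estimators, which take $r(\cdot)$ constant and equal to $1$, with $a=(d-2)\,\sigma^{2}$ in the original $X\sim N_{d}(\theta,\sigma^{2}I_{d})$ case.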
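As a minimal sketch of the Baranchik-type form just defined, the following Python code evaluates $\delta_{a,r(\cdot)}(x)$ for a vector given as a list of floats. The function names and the example choice $r(t)=t/(1+t)$ are illustrative assumptions, not taken from the paper; the James-Stein special case is coded as $r\equiv 1$ with $a=(d-2)\sigma^{2}$.

```python
def baranchik(x, a, r=lambda t: 1.0):
    """Return (1 - a * r(||x||^2) / ||x||^2) * x for a list of floats x."""
    sq_norm = sum(xi * xi for xi in x)
    if sq_norm == 0.0:
        raise ValueError("the estimator is undefined at x = 0")
    shrink = 1.0 - a * r(sq_norm) / sq_norm
    return [shrink * xi for xi in x]


def james_stein(x, sigma2=1.0):
    """James-Stein special case: r identically 1 and a = (d - 2) * sigma2."""
    return baranchik(x, a=(len(x) - 2) * sigma2)


def r_example(t):
    """Hypothetical r: values in [0, 1), non-decreasing, with r(t) / t decreasing."""
    return t / (1.0 + t)


x = [3.0, 4.0, 0.0]                      # d = 3 and ||x||^2 = 25
js = james_stein(x)                      # shrink factor 1 - 1/25
bar = baranchik(x, a=1.0, r=r_example)   # shrink factor 1 - 1/26
```

With $r\equiv 1$ the shrinkage depends on $x$ only through $a/\|x\|^{2}$, whereas a bounded increasing $r$ tempers the shrinkage near the origin.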

2.2 Balanced squared-error loss

We review here, for Bayesian inference and frequentist risk analysis, relationships between balanced loss $L_{\omega}$ and its unbalanced counterpart $L_{0}$. Such results appear in Dey et al. (1999), as well as in Jafari Jozani et al. (2006). For the former, the findings apply to a multivariate normal model $X\sim N_{d}(\theta,\sigma^{2}I_{d})$ and $\delta_{0}(X)=X$, while the latter work relates to a more general model $X\sim f_{\theta}$ and target estimator $\delta_{0}$. Some of the results will serve in later sections, but they are also presented here to illustrate the ease with which Bayesian analysis and frequentist risk evaluations for $L_{\omega}$ follow from corresponding results for squared-error loss $L_{0}$.

The following Lemma 2.2 will be used in Section 4 for the analysis of losses in (1.3), but is presented here as it serves to link the frequentist risk under loss $L_{\omega}$ to the risk under squared error loss $L_{0}$ , as presented in Corollary 2.1. To facilitate the presentation that follows, we denote the difference in losses $L_{\omega}$ between estimates $\delta_{0}(x)+(1-\omega)\,g(x)$ and $\delta_{0}(x)$ as

$\Delta_{\omega}(\theta,g)=L_{\omega}(\theta,\delta_{0}+(1-\omega)g)-L_{\omega}(\theta,\delta_{0}).$ (2.7)

Lemma 2.2.

Let $X\sim f_{\theta}$ . For the problem of estimating $\gamma(\theta)$ under balanced loss $L_{\omega}$ (as in (1.1)), we have $\Delta_{\omega}(\theta,g)\,=\,(1-\omega)^{2}\;\Delta_{0}(\theta,g)\,$ .

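Proof. A decomposition of (2.7) yields

$\begin{aligned}\Delta_{\omega}(\theta,g)&=\omega\,(1-\omega)^{2}\,\|g\|^{2}+(1-\omega)\left\{\|\delta_{0}+(1-\omega)g-\gamma(\theta)\|^{2}-\|\delta_{0}-\gamma(\theta)\|^{2}\right\}\\&=\omega\,(1-\omega)^{2}\,\|g\|^{2}+(1-\omega)^{3}\,\|g\|^{2}+2\,(1-\omega)^{2}\,g^{\prime}(\delta_{0}-\gamma(\theta))\\&=(1-\omega)^{2}\left\{\|g\|^{2}+2\,g^{\prime}(\delta_{0}-\gamma(\theta))\right\}=(1-\omega)^{2}\,\Delta_{0}(\theta,g).\end{aligned}$ ∎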
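The identity of Lemma 2.2 is purely algebraic, so it can be checked numerically at arbitrary points. The sketch below does this for one randomly drawn configuration; the helper names, the seed, the dimension, and the value of $\omega$ are illustrative assumptions.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two lists of floats."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


def balanced_loss(delta, delta0, gamma, omega):
    """L_omega = omega * ||delta - delta0||^2 + (1 - omega) * ||delta - gamma||^2."""
    return omega * sq_dist(delta, delta0) + (1.0 - omega) * sq_dist(delta, gamma)


def loss_difference(delta0, gamma, g, omega):
    """Delta_omega(theta, g): loss at delta0 + (1 - omega) g minus loss at delta0."""
    shifted = [d0 + (1.0 - omega) * gi for d0, gi in zip(delta0, g)]
    return (balanced_loss(shifted, delta0, gamma, omega)
            - balanced_loss(delta0, delta0, gamma, omega))


rng = random.Random(0)
d, omega = 4, 0.3
delta0 = [rng.gauss(0.0, 1.0) for _ in range(d)]
gamma = [rng.gauss(0.0, 1.0) for _ in range(d)]
g = [rng.gauss(0.0, 1.0) for _ in range(d)]

lhs = loss_difference(delta0, gamma, g, omega)                     # Delta_omega
rhs = (1.0 - omega) ** 2 * loss_difference(delta0, gamma, g, 0.0)  # (1 - omega)^2 Delta_0
```

The two quantities `lhs` and `rhs` agree up to floating-point rounding, whatever values are drawn.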

In terms of the frequentist risk $R_{\omega}$ associated with loss $L_{\omega}$ , given by $R_{\omega}(\theta,\delta)\,=\,\mathbb{E}\,\{L_{\omega}\left(\theta,\delta(X)\right)\,\}$ for an estimator $\delta(X)$ of $\gamma(\theta)$ , the following general result follows from Lemma 2.2.

Corollary 2.1.

Let $X\sim f_{\theta}$ and consider the problem of estimating $\gamma(\theta)$ . The estimator $\delta_{1,\omega}(X)\,=\,\delta_{0}(X)+(1-\omega)g_{1}(X)$ dominates $\delta_{2,\omega}(X)\,=\,\delta_{0}(X)+(1-\omega)g_{2}(X)$ under loss $L_{\omega}$ if and only if $\delta_{1,0}(X)\,=\,\delta_{0}(X)+g_{1}(X)$ dominates $\delta_{2,0}(X)\,=\,\delta_{0}(X)+g_{2}(X)$ under squared error loss $L_{0}$ .

Proof. We have

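$R_{\omega}(\theta,\delta_{1,\omega})-R_{\omega}(\theta,\delta_{2,\omega})=\mathbb{E}\{\Delta_{\omega}(\theta,g_{1}(X))\}-\mathbb{E}\{\Delta_{\omega}(\theta,g_{2}(X))\}=(1-\omega)^{2}\left\{R_{0}(\theta,\delta_{1,0})-R_{0}(\theta,\delta_{2,0})\right\},$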

which establishes the result. ∎

Now, turning to Bayesian inference, we have an equally simple relationship between balanced loss $L_{\omega}$ and its unbalanced counterpart $L_{0}$ . More precisely, the following well-known result conveniently expresses the Bayes estimator $\delta_{\pi,\omega}$ under $L_{\omega}$ for $\omega>0$ in terms of the Bayes estimator $\delta_{\pi,0}$ under $L_{0}$ , given of course by $\delta_{\pi,0}(X)\,=\,\mathbb{E}(\gamma(\theta)|X)$ .

Theorem 2.1.

For $X\sim f_{\theta}$ and a prior $\theta\sim\pi$ for which $Cov(\gamma(\theta)|x)$ exists for all $x$ , the Bayes estimator $\delta_{\pi,\omega}$ of $\gamma(\theta)$ under loss $L_{\omega}$ is given by $\delta_{\pi,\omega}(X)\,=\,\omega\,\delta_{0}(X)\,+\,(1-\omega)\,\delta_{\pi,0}(X)\,.$

Proof. Write $\delta_{\pi,\omega}(x)\,=\,\delta_{0}(x)\,+\,(1-\omega)\,g_{\pi,\omega}(x)$ for $0\leq\omega<1$ . By definition of the Bayes estimate, we thus have

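$g_{\pi,\omega}(x)=\hbox{argmin}_{g}\,\mathbb{E}\{L_{\omega}(\theta,\delta_{0}(x)+(1-\omega)g)|x\}=\hbox{argmin}_{g}\,\mathbb{E}\{\Delta_{\omega}(\theta,g)|x\}=\hbox{argmin}_{g}\,\mathbb{E}\{\Delta_{0}(\theta,g)|x\}=\mathbb{E}\{\gamma(\theta)|x\}-\delta_{0}(x).$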

From this, the result follows as $\delta_{\pi,\omega}(x)\,=\,\delta_{0}(x)\,+\,(1-\omega)\left[\mathbb{E}\{\gamma(\theta)|x\}\,-\,\delta_{0}(x)\,\right]\,=\,\omega\,\delta_{0}(x)\,+\,(1-\omega)\,\delta_{\pi,0}(x)\,.$ ∎
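Theorem 2.1 can be illustrated in a one-dimensional conjugate setting. The sketch below assumes $X|\theta\sim N(\theta,\sigma^{2})$ with a $N(0,\texttt{prior\_var})$ prior on $\theta$ and $\delta_{0}(x)=x$ (all assumptions made for the illustration), minimizes the posterior expected balanced loss over a grid, and compares the minimizer with $\omega\,\delta_{0}(x)+(1-\omega)\,\delta_{\pi,0}(x)$.

```python
def posterior_expected_balanced_loss(delta, x, omega, post_mean, post_var):
    """omega * (delta - x)^2 + (1 - omega) * E[(delta - theta)^2 | x], scalar theta."""
    return (omega * (delta - x) ** 2
            + (1.0 - omega) * ((delta - post_mean) ** 2 + post_var))


sigma2, prior_var, omega, x = 1.0, 2.0, 0.25, 1.5
shrink = prior_var / (prior_var + sigma2)      # conjugate normal-normal update
post_mean, post_var = shrink * x, shrink * sigma2

# Brute-force minimization over a grid of candidate estimates in [0, 3].
grid = [i / 1000.0 for i in range(0, 3001)]
best = min(grid, key=lambda d: posterior_expected_balanced_loss(
    d, x, omega, post_mean, post_var))

# Closed form from Theorem 2.1 with delta_0(x) = x.
closed_form = omega * x + (1.0 - omega) * post_mean
```

The grid minimizer `best` coincides with `closed_form` up to the grid spacing.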

Now, combining the last two results leads to the following Bayes dominance result.

Corollary 2.2.

For $X\sim f_{\theta}$ , a prior $\theta\sim\pi$ , and the problem of estimating $\gamma(\theta)$ , the Bayes estimator $\delta_{\pi,\omega}(X)$ dominates an estimator $\delta_{0}(X)+(1-\omega)g_{2}(X)$ under $L_{\omega}$ if and only if the Bayes estimator $\delta_{\pi,0}(X)\,=\,\mathbb{E}(\gamma(\theta)|X)$ dominates $\delta_{0}(X)+g_{2}(X)$ under squared-error loss $L_{0}$ .

Several examples can be found in the literature, namely among the references mentioned above. We do provide at the end of this subsection Example 2.1 as an illustration. Before doing so, we briefly address the issue of minimaxity, where relationships between balanced and unbalanced losses are not as immediate (e.g., Jafari Jozani et al., 2006). One situation though does simplify, namely the case where the target estimator $\delta_{0}(X)$ is itself minimax under the unbalanced loss. Moreover, the following result (Jafari Jozani et al., 2012, Theorem 4) holds in general for losses (1.2). As discussed at the outset of Section 3C, this will serve to guarantee that the dominating estimators of Theorem 3.3 are themselves minimax.

Theorem 2.2.

Let $X\sim f_{\theta}$ and consider the problem of estimating $\gamma(\theta)$ under loss (1.2). Suppose that the estimator $\delta_{0}(X)$ is minimax under unbalanced loss $\rho(\|\delta-\gamma(\theta)\|^{2})$ . Then, $\delta_{0}(X)$ is also minimax under loss (1.2) for all $0<\omega<1$ .

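Proof. Let $R_{\omega}$ denote the frequentist risk under loss (1.2). Since $\rho\geq 0$, we have $R_{\omega}(\theta,\delta)\geq(1-\omega)R_{0}(\theta,\delta)$ for any estimator $\delta$, so that $\sup_{\theta}R_{\omega}(\theta,\delta)\geq(1-\omega)\sup_{\theta}R_{0}(\theta,\delta)\geq(1-\omega)\sup_{\theta}R_{0}(\theta,\delta_{0})$. Since $R_{\omega}(\theta,\delta_{0})\,=\,(1-\omega)R_{0}(\theta,\delta_{0})$, the result is immediate. ∎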

Example 2.1.

We consider the classical problem of estimating a multivariate normal mean and illustrate how known Stein estimation results applicable to squared-error loss $L_{0}$ (e.g., Stein, 1981; Strawderman, 2003) translate to balanced loss $L_{\omega}$. Let $X\sim N_{d}(\mu,\sigma^{2}I_{d})$ and $S^{2}\sim\sigma^{2}\chi^{2}_{k}$ be independently distributed with $d\geq 3,k\geq 1$. Set $\theta=(\mu,\sigma^{2})$ and consider estimating $\gamma(\theta)=\mu$ under balanced loss $L_{\omega}$ with target estimator $\delta_{0}(X)=X$, which is minimax under $\frac{L_{0}(\theta,\delta)}{\sigma^{2}}$, and thus minimax under loss $\frac{L_{\omega}(\theta,\delta)}{\sigma^{2}}\,=\,\frac{\omega\,\|\delta-X\|^{2}+(1-\omega)\,\|\delta-\mu\|^{2}}{\sigma^{2}}$ by virtue of Theorem 2.2.

(A) For known $\sigma^{2}$, any estimator of the form $\delta(X)\,=\,X\,+\,\sigma^{2}\,g(X)$ such that $g$ is weakly differentiable, $\mathbb{E}_{\theta}\|g(X)\|^{2}<\infty$, and $\|g(X)\|^{2}+2\,div(g(X))\leq 0$ a.e., dominates $\delta_{0}(X)$ under loss $L_{0}$. It thus follows from Corollary 2.1 that $X+(1-\omega)\,\sigma^{2}\,g(X)$ dominates $\delta_{0}(X)$ under loss $L_{\omega}$ for such $g$’s. Such dominating estimators include the James-Stein estimator with $g(t)=-(d-2)t/\|t\|^{2}$, as well as Baranchik type estimators $\delta_{a,r(\cdot)}(X)$ in (2.5), with $0<a<2(d-2)(1-\omega)\sigma^{2}$ and conditions (2.6) on $r(\cdot)$.

(B) For unknown $\sigma^{2}$, with the same conditions on $g$, estimators of the form $X\,+\,\frac{S^{2}}{k+2}\,g(X)$ dominate $X$ under loss $L_{0}$. Again, it follows immediately from Corollary 2.1 that estimators $X\,+\,(1-\omega)\,\frac{S^{2}}{k+2}\,g(X)$ dominate $X$ under balanced loss $L_{\omega}$.

(C) For known $\sigma^{2}$ and Bayes estimators $\delta_{\pi,\omega}(X)$ under balanced loss $L_{\omega}$ associated with prior density $\pi(\theta)$, Theorem 2.1 along with a well-known representation for $\delta_{\pi,0}(X)$ tells us that

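$\delta_{\pi,\omega}(X)=\omega\,X+(1-\omega)\left(X+\sigma^{2}\,\frac{\nabla m(X)}{m(X)}\right)=X+(1-\omega)\,\sigma^{2}\,\frac{\nabla m(X)}{m(X)},$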

where $m(X)\,=\,(2\pi\sigma^{2})^{-d/2}\,\int_{\mathbb{R}^{d}}e^{-\frac{1}{2\sigma^{2}}\|X-\theta\|^{2}}\pi(\theta)\,d\theta$ is the marginal density of $X$. By virtue of Corollary 2.2, the estimator $\delta_{\pi,\omega}(X)$ dominates $X$ under loss $L_{\omega}$ if and only if $\delta_{\pi,0}(X)$ dominates $X$ under loss $L_{0}$. With the superharmonicity of either $\pi(\cdot),m(\cdot)$ or $\sqrt{m(\cdot)}$ a sufficient condition for $\delta_{\pi,0}(X)$ to dominate $X$ under loss $L_{0}$ (e.g., Strawderman, 2003), we thus infer that any of these conditions implies that $\delta_{\pi,\omega}(X)$ dominates $X$ under balanced loss $L_{\omega}$.
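Part (A) of Example 2.1 can be explored by simulation. The following sketch estimates, by Monte Carlo, the difference in balanced squared-error risk between $X+(1-\omega)\,\sigma^{2}g(X)$ with the James-Stein choice $g(t)=-(d-2)t/\|t\|^{2}$ and the benchmark $X$, taking $\sigma^{2}=1$. The sample size, seed, dimension, and the two values of $\theta$ are arbitrary illustrative choices.

```python
import random


def balanced_sq_loss(delta, x, theta, omega):
    """omega * ||delta - x||^2 + (1 - omega) * ||delta - theta||^2."""
    to_target = sum((di - xi) ** 2 for di, xi in zip(delta, x))
    to_param = sum((di - ti) ** 2 for di, ti in zip(delta, theta))
    return omega * to_target + (1.0 - omega) * to_param


def scaled_james_stein(x, omega, sigma2=1.0):
    """X + (1 - omega) * sigma2 * g(X) with g(t) = -(d - 2) t / ||t||^2."""
    d = len(x)
    sq_norm = sum(xi * xi for xi in x)
    factor = 1.0 - (1.0 - omega) * (d - 2) * sigma2 / sq_norm
    return [factor * xi for xi in x]


def risk_difference(theta, omega, n_draws=20000, seed=1):
    """Monte Carlo estimate of R_omega(shrinkage estimator) - R_omega(X)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = [ti + rng.gauss(0.0, 1.0) for ti in theta]
        total += (balanced_sq_loss(scaled_james_stein(x, omega), x, theta, omega)
                  - balanced_sq_loss(x, x, theta, omega))
    return total / n_draws


diff_at_origin = risk_difference([0.0] * 5, omega=0.5)
diff_far = risk_difference([3.0] * 5, omega=0.5)
```

Both estimated differences are negative, in line with the dominance statement, and the gain is largest at $\theta=0$, where by Lemma 2.2 the exact value is $-(1-\omega)^{2}(d-2)$.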

3 Risk analysis for loss $\omega\rho(\|\delta-X\|^{2})+(1-\omega)\rho(\|\delta-\theta\|^{2})$

A. The loss function

For a model (2.4), we evaluate the frequentist risk performance of an estimator $\delta(X)$ of $\theta$ under the balanced loss

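$L_{\omega,\rho}(\theta,\delta)=\omega\,\rho(\|\delta-X\|^{2})+(1-\omega)\,\rho(\|\delta-\theta\|^{2}),\ 0\leq\omega<1,$ (3.8)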

which incorporates the target estimator $\delta_{0}(X)=X$ . For the function $\rho$ , we assume the following throughout this section:

Assumption 1.

$\rho(0)=0\,,\,0<\rho^{\prime}(0)<\infty$ , and $\rho^{\prime}$ is completely monotone on $\mathbb{R}_{+}$ , i.e., $(-1)^{k}\,\rho^{(k+1)}(t)\geq 0$ for $t>0$ and for $k=0,1,\ldots$ .

Examples of loss functions $L_{\omega,\rho}$ for which $\rho$ satisfies Assumption 1, other than $\rho(t)=t$, include: (i) $\rho(t)=1-e^{-t/\alpha}$ with $\alpha>0$, (ii) $\rho(t)\,=\,\log(1+t)$, (iii) $\rho(t)=(1+t/\gamma)^{\beta}-1$ with $\gamma>0,\beta\in(0,1)$, and (iv) cases $\rho(t)\,=\,z(0)-z(t)$ with $z$ being completely monotone such as $\rho(t)\,=\,r^{2}t/(rt+1)$ with $r>0$. Case (i) is known as reflected normal loss, while examples (iv) represent a broader class of bounded losses. $L^{\beta}$ losses with $\rho(t)=t^{\beta}$, $\beta\in(0,1)$, represent concave choices, but such $\rho$’s do not satisfy the finiteness assumption on $\rho^{\prime}(0)$.
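The example losses share two properties used repeatedly later on: $\rho(0)=0$ and, by concavity, $\rho(t)\leq\rho^{\prime}(0)\,t$. The sketch below checks both on a grid for examples (i) to (iv). The parameter values are arbitrary, and example (iii) is coded with the constant $1$ subtracted so that $\rho(0)=0$ holds, which is an assumption of this sketch.

```python
import math

# Illustrative parameter values (not from the paper).
alpha, gamma_, beta, r = 2.0, 3.0, 0.5, 1.5

# Each entry maps a label to (rho, rho'(0)).
examples = {
    "reflected normal": (lambda t: 1.0 - math.exp(-t / alpha), 1.0 / alpha),
    "log": (lambda t: math.log(1.0 + t), 1.0),
    "shifted power": (lambda t: (1.0 + t / gamma_) ** beta - 1.0, beta / gamma_),
    "bounded rational": (lambda t: r * r * t / (r * t + 1.0), r * r),
}


def satisfies_tangent_bound(rho, slope_at_zero, grid):
    """Check rho(0) = 0 and rho(t) <= rho'(0) * t at every grid point."""
    return rho(0.0) == 0.0 and all(
        rho(t) <= slope_at_zero * t + 1e-12 for t in grid)


grid = [0.01 * k for k in range(1, 2001)]   # t from 0.01 to 20
report = {name: satisfies_tangent_bound(rho, slope, grid)
          for name, (rho, slope) in examples.items()}
```

A grid check of this kind does not establish complete monotonicity of $\rho^{\prime}$; it only illustrates the tangent-line bound at the origin.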

B. Further technical results

We now expand on various technical results which are pivotal to the risk analysis in Subsection 3C.

Lemma 3.3.

Consider $X\sim f(\|x-\theta\|^{2}),x,\theta\in\mathbb{R}^{d}$ , admitting representation (2.4) with mixing variable $V$ , and $\rho$ satisfying Assumption 1. Let $Y\sim f^{*}(\|y-\theta\|^{2})\,=\,\rho^{\prime}(\|y-\theta\|^{2})\,f(\|y-\theta\|^{2})/K\,;y\in\mathbb{R}^{d}$ ; with $K=\mathbb{E}_{0}(\rho^{\prime}(\|X\|^{2}))$ . Then,

(a) The distribution of $Y$ admits a scale mixture of normals representation

$Y|W\sim N_{d}(\theta,WI_{d}),\ \hbox{with}\ W\sim h;$ (3.9)

(b) Moreover, the distribution of $W$ is stochastically smaller than the distribution of $V$.

Proof. First observe that $f^{*}$ is a density since $K\,=\,\mathbb{E}_{0}(\rho^{\prime}(\|X\|^{2}))\leq\rho^{\prime}(0)<\infty$ . Part (a) thus follows from Lemma 2.1. For part (b), given that $\rho^{\prime}$ and $f$ are completely monotone, they are representable as Laplace transforms (Lemma 2.1):

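$f(t)=K_{1}\int_{0}^{\infty}e^{-t/(2v)}\,dG(v),\quad\rho^{\prime}(t)=K_{2}\int_{0}^{\infty}e^{-t/(2\tau)}\,dH(\tau),$

for $t\in\mathbb{R}_{+}$. From this, we have for $t\in\mathbb{R}_{+}$

$f^{*}(t)=\frac{\rho^{\prime}(t)\,f(t)}{K}=\frac{K_{1}K_{2}}{K}\int_{0}^{\infty}\int_{0}^{\infty}\exp\left\{-\frac{t}{2}\left(\frac{1}{v}+\frac{1}{\tau}\right)\right\}dG(v)\,dH(\tau).$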

Interpreting in terms of scale mixture of normals, we have for $Y\sim f^{*}(\|y-\theta\|^{2})$ representation (3.9) with $W=^{d}\frac{\tau V}{\tau+V}$ . Finally, from this, we have $\mathbb{P}(W\leq s)\,\geq\,\,\mathbb{P}(V\leq s),$ for all $s>0$ and the result follows. ∎

The two lemmas that follow, which we will require, rely partly on properties of superharmonic functions. We recall that a continuous function $g:\mathbb{R}^{d}\to\mathbb{R}$ is superharmonic if and only if, for all $t_{0}\in\mathbb{R}^{d}$ and $r>0$, the average of $g$ over the surface of the sphere $S_{r}(t_{0})=\{t\in\mathbb{R}^{d}:\|t-t_{0}\|=r\}$, centered at $t_{0}$ and of radius $r$, is less than or equal to $g(t_{0})$. For twice differentiable $g$, the superharmonicity of $g$ is equivalent to its Laplacian being less than or equal to $0$, i.e., $\Delta\,g\leq 0$ with $\Delta\,g\,=\,\sum_{i=1}^{d}\frac{\partial^{2}}{\partial t_{i}^{2}}g(t)$.

Lemma 3.4.

Let $Z\sim N_{d}(0,I_{d})$ with $d\geq 3$ and let $T=\|\alpha Z+\theta\|^{2}$ with $\alpha>0$ and $\theta\in\mathbb{R}^{d}$ . Then, we have the following:

(a) $\epsilon(\alpha)\,=\,\mathbb{E}_{\alpha}(\frac{1}{T})$ is decreasing in $\alpha$ for $\alpha>0$;

(b) $\mathbb{E}\,g(\alpha Z+\theta)$ is non-increasing in $\alpha$ provided that $g(\cdot)>0$ and $g$ is superharmonic.

Proof. The proof of part (a) is relegated to an Appendix. For part (b), first denote $U_{m},m>0,$ as a random vector uniformly distributed on the sphere $S_{m}(0)$ centered at $0$ of radius $m$. It suffices to show that $\beta(\alpha,r)=\mathbb{E}\,\left(g(\alpha Z+\theta)|\|Z\|=r\right)$ is non-increasing in $\alpha$ for all $r>0$. Since $(Z|\|Z\|=r)\,\sim\,U_{r}$, we have $\beta(\alpha,r)\,=\,\mathbb{E}\,\left\{g(U_{\alpha r}+\theta)\right\}$. Since, for a superharmonic function, the sphere mean is non-increasing in the radius (see, e.g., Fourdrinier et al. 2018, Theorem 7.4), we infer that $\beta(\alpha,r)$ is non-increasing in $\alpha$, which concludes the proof. ∎

Lemma 3.5.

Let $\theta\in\mathbb{R}^{d}$ , $a>0$ , and $\rho$ satisfy Assumption 1. Consider $X\sim f(\|x-\theta\|^{2})$ , $Y\sim f^{*}(\|y-\theta\|^{2})$ as in (2.4) and Lemma 3.3, respectively.

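(a) For $d\geq 3$, we have

$\mathbb{E}_{\theta}\left(\rho\left(\frac{a^{2}}{X^{\prime}X}\right)\right)\leq\rho^{\prime}(0)\,\mathbb{E}_{\theta}\left(\frac{a^{2}}{X^{\prime}X}\right)\leq\rho^{\prime}(0)\,\mathbb{E}_{\theta}\left(\frac{a^{2}}{Y^{\prime}Y}\right);$

(b) For $d\geq 4$ and $r:\mathbb{R}_{+}\to[0,1]$ a twice-differentiable function that is non-decreasing and concave, we have

$\mathbb{E}_{\theta}\left(\rho\left(\frac{a^{2}\,r^{2}(\|X\|^{2})}{\|X\|^{2}}\right)\right)\leq\rho^{\prime}(0)\,\mathbb{E}_{\theta}\left(\frac{a^{2}\,r(\|Y\|^{2})}{\|Y\|^{2}}\right).$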

Proof. (a) The first inequality follows from the inequality

$\rho(t)\leq\rho(0)+\rho^{\prime}(0)\,t=\rho^{\prime}(0)\,t,$ (3.10)

which holds since $\rho$ is concave with $\rho(0)=0$ . The second inequality follows from Lemma 3.3 and part (a) of Lemma 3.4. Indeed, since $X|V\sim N_{d}(\theta,VI_{d})$ , $Y|W\sim N_{d}(\theta,WI_{d})$ , we have, with the notation of Lemma 3.4, $\mathbb{E}_{\theta}(\frac{1}{\|X\|^{2}})\,=\mathbb{E}(\epsilon(\sqrt{V}))$ and $\mathbb{E}_{\theta}(\frac{1}{\|Y\|^{2}})\,=\mathbb{E}(\epsilon(\sqrt{W}))$ , and the result follows since $\epsilon(\cdot)$ is decreasing and $W$ is stochastically smaller than $V$ .

(b) Defining $Z\sim N_{d}(0,I_{d})$ and denoting $g_{0}(t)\,=\,\frac{r(\|t\|^{2})}{\|t\|^{2}}\,,\,t\in\mathbb{R}^{d}$ , we have

[TABLE]

where (i) the two equalities follow from the scale mixture representations of $f$ and $f^{*}$ ; (ii) the first inequality follows since $\rho$ is non-decreasing and $0\leq r^{2}(\|t\|^{2})\leq r(\|t\|^{2})\leq 1$ for $t\in\mathbb{R}^{d}$ , (iii) the second inequality follows from (3.10), and (iv) the third inequality follows from Lemma 3.3, part (b) of Lemma 3.4, as in the above proof of part (a), and from the fact that

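$g_{0}(t)\ \hbox{is superharmonic for}\ d\geq 4,$

provided $r(\cdot)$ is non-negative, non-decreasing, and concave. Finally, to justify the above, note that, for twice-differentiable $h(\|t\|^{2}),t\in\mathbb{R}^{d}$,

$\Delta\,h(\|t\|^{2})=2d\,h^{\prime}(\|t\|^{2})+4\,(t^{\prime}t)\,h^{\prime\prime}(\|t\|^{2}),$

so that the choice $h(\|t\|^{2})=\,g_{0}(t)\,=\,\frac{r(\|t\|^{2})}{\|t\|^{2}}$ yields with a little bit of computation, writing $u=\|t\|^{2}$,

$\Delta(g_{0}(t))=4\,r^{\prime\prime}(u)+\frac{2(d-4)}{u^{2}}\left\{u\,r^{\prime}(u)-r(u)\right\}\leq 0\ \hbox{for}\ d\geq 4,$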

since the properties of $r(\cdot)$ imply that $r^{\prime\prime}(u)\leq 0$ and $r(u)\geq ur^{\prime}(u)$ for all $u>0$ . ∎
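The role of the restriction $d\geq 4$ can be seen on a concrete case. The sketch below takes the illustrative choice $r(u)=u/(1+u)$ (an assumption of this example), for which $g_{0}(t)=1/(1+\|t\|^{2})$, and compares a finite-difference Laplacian with the closed form $2d\,h^{\prime}(u)+4u\,h^{\prime\prime}(u)$ for $h(u)=1/(1+u)$.

```python
def g0(t):
    """g0(t) = r(||t||^2) / ||t||^2 with r(u) = u / (1 + u), i.e. 1 / (1 + ||t||^2)."""
    return 1.0 / (1.0 + sum(ti * ti for ti in t))


def laplacian_fd(func, t, h=1e-3):
    """Central finite-difference approximation of the Laplacian of func at t."""
    base = func(t)
    total = 0.0
    for i in range(len(t)):
        up = list(t)
        up[i] += h
        down = list(t)
        down[i] -= h
        total += (func(up) - 2.0 * base + func(down)) / (h * h)
    return total


def laplacian_closed_form(t):
    """2 d h'(u) + 4 u h''(u) with h(u) = 1 / (1 + u) and u = ||t||^2."""
    d = len(t)
    u = sum(ti * ti for ti in t)
    return -2.0 * d / (1.0 + u) ** 2 + 8.0 * u / (1.0 + u) ** 3


point3 = [2.0, 1.0, 1.0]         # d = 3, u = 6
point4 = [2.0, 1.0, 1.0, 0.0]    # d = 4, u = 6
fd3, cf3 = laplacian_fd(g0, point3), laplacian_closed_form(point3)
fd4, cf4 = laplacian_fd(g0, point4), laplacian_closed_form(point4)
```

At these points the Laplacian is positive for $d=3$ and negative for $d=4$, so this particular $g_{0}$ fails to be superharmonic in three dimensions.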

C. Dominance results

For balanced loss $L_{\omega,\rho}$ with $\rho$ satisfying Assumption 1, and a scale mixture of normals distribution on $X$ with $d\geq 3$, we provide James-Stein and Baranchik-type estimators that dominate $X$. In such cases, it follows that $X$ is minimax for the unbalanced case $L_{0,\rho}$ with constant risk $R_{0}$ (e.g., Kubokawa et al., 2015). By virtue of Theorem 2.2, $X$ is also minimax for balanced loss $L_{\omega,\rho}$. The following dominance results thus provide dominating estimators which are also minimax under loss $L_{\omega,\rho}$.

Theorem 3.3.

Consider $X\sim f(\|x-\theta\|^{2})$ ; $x,\theta\in\mathbb{R}^{d}$ ; admitting representation (2.4), balanced loss function $L_{\omega,\rho}$ as in (3.8) with $\rho$ satisfying Assumption 1.

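(a) If $d\geq 3$, $\delta_{a}(X)=(1-\frac{a}{\|X\|^{2}})X$ dominates $\delta_{0}(X)=X$ provided

$0<a<\frac{2(d-2)\,K\,(1-\omega)\,\{\mathbb{E}(W^{-1})\}^{-1}}{\omega\,\rho^{\prime}(0)+(1-\omega)\,K},$ (3.12)

with $K=\mathbb{E}_{0}(\rho^{\prime}(\|X\|^{2}))$, and $W$ the mixing variance for $Y\sim f^{*}(\|y-\theta\|^{2})$ as defined in Lemma 3.3. An equivalent expression for the above dominance condition is

$0<a<\frac{2\,K^{2}\,(1-\omega)\,\left\{\mathbb{E}_{0}\left(\frac{\rho^{\prime}(\|X\|^{2})}{\|X\|^{2}}\right)\right\}^{-1}}{\omega\,\rho^{\prime}(0)+(1-\omega)\,K}.$ (3.13)

(b) If $d\geq 4$, a Baranchik-type estimator $\delta_{a,r(\cdot)}(X)$ in (2.5) dominates $\delta_{0}(X)=X$ provided (3.12) holds and provided $r(\cdot)$ satisfies conditions (2.6).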

Proof. (a) First, the stated equivalence between (3.12) and (3.13) holds since, on one hand,

[TABLE]

(as $\frac{\|Y\|^{2}}{W}|W\sim\chi^{2}_{d}(0)$ when $\theta=0$ ) and, on the other hand,

[TABLE]

Second, we have for a difference in risks

[TABLE]

where the inequality follows from part (a) of Lemma 3.5 and the concave function inequality $\rho(t_{1})-\rho(t_{2})\leq\rho^{\prime}(t_{2})(t_{1}-t_{2})$ for all $t_{1},t_{2}\geq 0$. Now, with representation (3.9), by conditioning on $W$, and by Stein’s identity and the calculation $\mathbb{E}_{\theta}\left[(Y-\theta)^{\prime}\frac{Y}{Y^{\prime}Y}\Big|W\right]\,=W\,\mathbb{E}_{\theta}\left[\hbox{div}\left(\frac{Y}{Y^{\prime}Y}\right)\Big|W\right]\,=\,W\,\mathbb{E}_{\theta}\left[\frac{d-2}{Y^{\prime}Y}\Big|W\right]$ (with probability $1$), we obtain

[TABLE]

By noticing that $\mathbb{E}_{\theta}^{Y|w}\left(\frac{W}{\|Y\|^{2}}\right)$ is increasing in $w>0$ , given that $\frac{\|Y\|^{2}}{W}|W\sim\chi^{2}_{d}(\lambda=\frac{\|\theta\|^{2}}{W})$ and $\chi^{2}_{d}(\lambda)$ distributions are stochastically increasing in $\lambda$ , we infer from (3.14) and the covariance inequality (i.e., $\mathbb{E}f_{1}(W)f_{2}(W)\leq\mathbb{E}f_{1}(W)\,\mathbb{E}f_{2}(W)$ for $f_{1}(\cdot)$ increasing and $f_{2}(\cdot)$ decreasing) that

[TABLE]

From the above, it follows immediately that (3.12) is a sufficient condition for $\Delta_{a}(\theta)$ to be negative for all $\theta$ .

(b) The proof is similar to that of part (a). Using the concave function inequality $\rho(t_{1})-\rho(t_{2})\leq\rho^{\prime}(t_{2})(t_{1}-t_{2})$, Stein’s identity, and part (b) of Lemma 3.5, we obtain for the difference in risk

[TABLE]

Now, it is easy to verify that $r(t)/t$ is decreasing in $t>0$ under the given conditions on $r(\cdot)$ . Finally, an application of the covariance inequality leads to an inequality as in (3.15) with $\mathbb{E}_{\theta}^{Y|W}\left(\frac{W}{\|Y\|^{2}}\right)$ replaced by $\mathbb{E}_{\theta}^{Y|W}\left(\frac{W\,r(\|Y\|^{2})}{\|Y\|^{2}}\right)$ . The result then follows. ∎

Remark 3.1.

From inequality (3.15), it also follows that dominance occurs, in both parts (a) and (b) of Theorem 3.3 for the quantity $a$ equal to the upper cut-off point in (3.12) (or (3.13)) unless $\rho(t)=t$ and $W$ is degenerate, i.e., original balanced loss and the multivariate normal case.

The proof of Theorem 3.3 is unified with respect to the choice of $\rho$, the coefficient $\omega$ in the balanced loss, and the underlying scale mixture of normals distribution. To conclude, we point out that the above results can be seen as extensions of Kubokawa et al. (2015), as well as Strawderman (1974), whose results can be seen as particular cases of $\omega=0$ in the former case, and $\omega=0,\rho(t)=t$ in the latter case.
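To get a feel for the cut-off in (3.12), consider the special case, assumed here for illustration, of $X\sim N_{d}(\theta,vI_{d})$ with reflected normal loss $\rho(t)=1-e^{-t/\alpha}$. Then $\rho^{\prime}(0)=1/\alpha$, the moment generating function of a chi-square variable gives $K=\alpha^{-1}(1+2v/\alpha)^{-d/2}$, and $\rho^{\prime}f$ is proportional to a normal kernel with variance $w=\alpha v/(\alpha+2v)$, so that $W$ is degenerate at $w$. The function name below is hypothetical.

```python
def cutoff_reflected_normal(d, v, alpha, omega):
    """Upper bound on a in (3.12) for N_d(theta, v I_d) and rho(t) = 1 - exp(-t / alpha)."""
    rho_prime_0 = 1.0 / alpha
    # K = E_0 rho'(||X||^2) = (1 / alpha) * E exp(-v * chi2_d / alpha)
    K = rho_prime_0 * (1.0 + 2.0 * v / alpha) ** (-d / 2.0)
    # rho'(t) f(t) is proportional to exp(-t / (2 w)) with 1 / w = 1 / v + 2 / alpha
    w = alpha * v / (alpha + 2.0 * v)
    # W is degenerate at w, hence {E(1 / W)}^{-1} = w
    numerator = 2.0 * (d - 2) * K * (1.0 - omega) * w
    denominator = omega * rho_prime_0 + (1.0 - omega) * K
    return numerator / denominator


bound = cutoff_reflected_normal(d=5, v=1.0, alpha=4.0, omega=0.5)
near_quadratic = cutoff_reflected_normal(d=5, v=1.0, alpha=1e9, omega=0.5)
```

As $\alpha\to\infty$ the loss becomes proportional to balanced squared-error loss and the cut-off tends to $2(d-2)(1-\omega)v$, which matches the range for $a$ in part (A) of Example 2.1; for finite $\alpha$ the admissible range is narrower.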

4 Risk analysis for loss $\ell\left(\omega\|\delta-\delta_{0}\|^{2}+(1-\omega)\|\delta-\theta\|^{2}\right)$

The main dominance finding of this section (Theorem 4.4) relates to a multivariate normal $X\sim N_{d}(\theta,\sigma^{2}I_{d})$ , and more generally to $X\sim f(\|x-\theta\|^{2})$ distributed as a scale mixture of normals as in (2.4). We assess the frequentist risk performance of an estimator $\delta(X)$ of $\gamma(\theta)$ under the balanced loss

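$L_{\omega,\ell}(\theta,\delta)=\ell\left(\omega\,\|\delta-\delta_{0}\|^{2}+(1-\omega)\,\|\delta-\gamma(\theta)\|^{2}\right).$ (4.16)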

More specifically, we consider the target estimator $\delta_{0}(X)=X$ and set $\gamma(\theta)=\theta$ , and our objective is to provide, for $d\geq 3$ , estimators of $\theta$ that dominate $\delta_{0}(X)=X$ under balanced loss (4.16) other than Section 2’s results for $\ell(t)=t$ . For the function $\ell$ , we assume, unless stated otherwise, the following throughout this section:

Assumption 2.

$\ell(\cdot)\geq 0$ , $\,\ell^{\prime}(\cdot)>0$ , and $\ell^{\prime}$ is completely monotone on $\mathbb{R}_{+}$ , i.e., $(-1)^{k}\,\ell^{(k+1)}(t)\geq 0$ for $t>0$ and for $k=0,1,\ldots$ .

Examples of losses $L_{\omega,\ell}$ with $\ell$ satisfying Assumption 2 include examples (i), (ii), (iii), (iv) given for $\rho$ in part A of Section 3, but the cases $\ell(t)=t^{\beta},\beta\in(0,1),$ are also included here since the assumption $\ell^{\prime}(0)<\infty$ is not required.

We proceed with a preparatory lemma which exploits the concavity of $\ell$ , and which relates the difference in losses $L_{\omega,\ell}$ , between estimates $\delta_{0}(x)+(1-\omega)\,g(x)$ and $\delta_{0}(x)$ , to the balanced squared-error loss difference $\Delta_{\omega}(\theta,g)$ in (2.7). We therefore define

[TABLE]

and we have the following.

Lemma 4.6.

Let $X\sim f_{\theta}$ . For the problem of estimating $\theta$ under loss (4.16) with twice-differentiable, increasing, and concave $\ell$ , we have

[TABLE]

Proof. The proof uses the fact that $\ell(a+b)\,-\,\ell(a)\leq b\,\ell^{\prime}(a)$ , since $\ell$ is concave, with $a=L_{\omega}(\theta,\delta_{0})$ and $a+b=L_{\omega}(\theta,\delta_{0}+(1-\omega)g)$ . This yields:

[TABLE]

which is indeed (4.18), by virtue of Lemma 2.2 and since $L_{\omega}(\theta,\delta_{0})\,=\,(1-\omega)\,\|\delta_{0}-\gamma(\theta)\|^{2}$ . ∎
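The concavity step in the proof can be checked numerically. The sketch below draws random $\theta$ , $x$ and $g$ , takes $\delta_{0}(x)=x$ and $\ell(t)=\log(1+t)$ , and records the slack in $\ell(a+b)-\ell(a)\leq b\,\ell^{\prime}(a)$ with $a$ and $a+b$ as in the proof. The dimension, $\omega=0.3$ , the sampling scheme and all function names are assumptions chosen for the illustration.

```python
import math
import random

def sq_norm(v):
    return sum(c * c for c in v)

def balanced_sq_loss(theta, delta, delta0, omega):
    # L_omega(theta, delta) = omega ||delta - delta0||^2 + (1 - omega) ||delta - theta||^2
    fit = sq_norm([u - v for u, v in zip(delta, delta0)])
    precision = sq_norm([u - v for u, v in zip(delta, theta)])
    return omega * fit + (1.0 - omega) * precision

def concavity_slack(theta, x, g, omega, ell, ell_prime):
    # Returns b * l'(a) - (l(a + b) - l(a)), which is >= 0 for concave l.
    delta0 = x  # target estimator delta_0(x) = x
    shifted = [xi + (1.0 - omega) * gi for xi, gi in zip(x, g)]
    a = balanced_sq_loss(theta, delta0, delta0, omega)
    a_plus_b = balanced_sq_loss(theta, shifted, delta0, omega)
    return (a_plus_b - a) * ell_prime(a) - (ell(a_plus_b) - ell(a)), a

rng = random.Random(0)
d, omega = 4, 0.3
min_slack = float("inf")
max_identity_error = 0.0
for _ in range(1000):
    theta = [rng.gauss(0.0, 2.0) for _ in range(d)]
    x = [t + rng.gauss(0.0, 1.0) for t in theta]
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    slack, a = concavity_slack(theta, x, g, omega, math.log1p, lambda t: 1.0 / (1.0 + t))
    min_slack = min(min_slack, slack)
    # L_omega(theta, delta_0) = (1 - omega) ||delta_0 - theta||^2, as used at the end of the proof
    target = (1.0 - omega) * sq_norm([u - v for u, v in zip(x, theta)])
    max_identity_error = max(max_identity_error, abs(a - target))

print(min_slack, max_identity_error)
```

The slack stays non-negative up to floating-point error whatever the sign of $b$ , which is what allows the bound to be applied to an arbitrary perturbation $g$ .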

A basic result for estimating a mean vector $\theta$ under quadratic loss, for scale mixtures of normal distributions, is the following.

Lemma 4.7.

(Strawderman, 1974)

Let $X\sim f(\|x-\theta\|^{2})$ have a scale mixture of normals distribution as in (2.4) with $d\geq 3$ , and consider estimating $\theta$ with loss $\|\delta-\theta\|^{2}$ . Consider Baranchik-type estimators $\delta_{a,r(\cdot)}(X)$ as in (2.5) with conditions (2.6). Then, $\delta_{a,r(\cdot)}(X)$ dominates $X$ provided

[TABLE]

and provided $E_{0}[\|X\|^{2}]$ and $E_{0}[1/\|X\|^{2}]$ are finite.

The main result of this section can now be presented and established.

Theorem 4.4.

Let $X\sim f(\|x-\theta\|^{2})$ have a scale mixture of normals distribution as in (2.4) with $d\geq 3$ , and consider estimating $\theta$ with loss $L_{\omega,\ell}$ , as in (4.16) with $\delta_{0}(X)=X$ , where $\ell$ satisfies Assumption 2. Consider the Baranchik-type estimator $\delta_{a(1-\omega),r(\cdot)}(X)$ as in (2.5) with conditions (2.6) on $r(\cdot)$ . Then, assuming that $f_{0}^{*}(x)\,=\,\ell^{\prime}((1-\omega)\|x\|^{2})\,f(\|x\|^{2})/K_{1}$ is a density on $\mathbb{R}^{d}$ , $\delta_{a(1-\omega),r}(X)$ dominates $X$ provided

[TABLE]

and provided $\mathbb{E}_{0,\omega}^{*}[\|X\|^{2}]$ and $\mathbb{E}_{0,\omega}^{*}[1/\|X\|^{2}]$ are finite, where the expectation $\mathbb{E}_{0,\omega}^{*}$ is taken with respect to $f_{0}^{*}$ .

Proof. With the given notation, observe that $\delta_{a(1-\omega),r}(X)=X+(1-\omega)\,g_{a,r}(X)$ with $g_{a,r}(X)=-\frac{ar(\|X\|^{2})}{\|X\|^{2}}\,X\,$ . Therefore, by Lemma 4.6 with $\delta_{0}(X)=X$ and $\gamma(\theta)=\theta$ , we have for the difference in losses between $\delta_{a(1-\omega),r}(X)$ and $X$ :

[TABLE]

Finally, since both $\ell^{\prime}$ and $f$ are completely monotone, so is the density $f_{0}^{*}$ (Lemma 2.1, part a). This implies that $f_{0}^{*}$ is a scale mixture of normals density and the result thus follows immediately from Lemma 4.7. ∎
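A small Monte Carlo experiment can illustrate the kind of dominance described by Theorem 4.4 in the normal case. The sketch below assumes $X\sim N_{d}(\theta,I_{d})$ with $d=6$ , $\omega=1/2$ , $\ell(t)=\log(1+t)$ and $r\equiv 1$ , and it assumes that the cut-off takes the form $2/\mathbb{E}_{0,\omega}^{*}[1/\|X\|^{2}]$ , which it estimates by reweighting draws from $f$ with $\ell^{\prime}((1-\omega)\|x\|^{2})$ . The multiple $a$ is set to half of that estimate. All names, sample sizes and parameter values are choices made for this illustration.

```python
import math
import random

def ell_prime(t):
    return 1.0 / (1.0 + t)  # derivative of l(t) = log(1 + t)

def sq_norm(v):
    return sum(c * c for c in v)

def draw_x(theta, rng):
    return [t + rng.gauss(0.0, 1.0) for t in theta]

def cutoff_estimate(d, omega, n, rng):
    # Estimates 2 / E*[1/||X||^2], with E* taken under f_0^* proportional to
    # l'((1 - omega) ||x||^2) f(||x||^2), by self-normalised reweighting of N_d(0, I_d) draws.
    num = den = 0.0
    for _ in range(n):
        t = sq_norm(draw_x([0.0] * d, rng))
        w = ell_prime((1.0 - omega) * t)
        num += w / t
        den += w
    return 2.0 * den / num

def loss(theta, delta, x, omega):
    # L_{omega, l} with target estimator delta_0(x) = x and l = log(1 + .)
    fit = sq_norm([u - v for u, v in zip(delta, x)])
    precision = sq_norm([u - v for u, v in zip(delta, theta)])
    return math.log1p(omega * fit + (1.0 - omega) * precision)

def risks(theta, omega, a, n, rng):
    # Returns estimated risks of X and of X - (1 - omega) a X / ||X||^2 on common draws.
    risk_x = risk_shrink = 0.0
    for _ in range(n):
        x = draw_x(theta, rng)
        factor = 1.0 - (1.0 - omega) * a / sq_norm(x)
        risk_x += loss(theta, x, x, omega)
        risk_shrink += loss(theta, [factor * c for c in x], x, omega)
    return risk_x / n, risk_shrink / n

rng = random.Random(2019)
d, omega = 6, 0.5
a = 0.5 * cutoff_estimate(d, omega, 20000, rng)
risk_x_origin, risk_shrink_origin = risks([0.0] * d, omega, a, 20000, rng)
risk_x_far, risk_shrink_far = risks([1.0] * d, omega, a, 20000, rng)
print(a, risk_x_origin, risk_shrink_origin, risk_x_far, risk_shrink_far)
```

Both risks are evaluated on the same draws, so the comparison is paired and the simulation noise in the difference is reduced. The experiment only probes two values of $\theta$ and is no substitute for the theorem.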

Remark 4.2.

For the unbalanced case $\omega=0$ , one recovers Theorem 2.1 of Kubokawa et al. (2015). For the original balanced loss function with $\ell(t)=t$ , one may recover the result of Theorem 4.4 directly by relying on Lemma 4.7 and Lemma 2.2, as illustrated in Example 2.1 for the multivariate normal case.

Moreover, it is interesting to compare the balanced and unbalanced cut-off points $a_{0}(\omega)=\frac{2}{\mathbb{E}_{0,\omega}^{*}(\frac{1}{\|X\|^{2}})}$ and $a_{0}(0)=\frac{2}{\mathbb{E}_{0,0}^{*}(\frac{1}{\|X\|^{2}})}$ . For loss $L_{\omega,\ell}$ with $\ell(t)=t^{\beta}$ , $0<\beta<1$ , we have $a_{0}(\omega)=a_{0}(0)$ for all $\omega\in(0,1)$ , since $\ell^{\prime}((1-\omega)t)$ is then proportional to $\ell^{\prime}(t)$ . For choices $\ell$ such that $\frac{\ell^{\prime}((1-\omega)t)}{\ell^{\prime}(t)}$ is non-decreasing in $t$ , we have a monotone likelihood ratio ordering in $\|x\|^{2}$ between the density $f_{0}^{*}$ associated with $\omega$ and the one associated with $\omega=0$ , with the former being stochastically larger. This implies the ordering $\mathbb{E}_{0,\omega}^{*}(\frac{1}{\|X\|^{2}})\leq\mathbb{E}_{0,0}^{*}(\frac{1}{\|X\|^{2}})$ , and therefore $a_{0}(\omega)\geq a_{0}(0)$ . Examples where such a condition holds include: $\ell(t)=1-e^{-t/\alpha}$ , $\alpha>0$ , $\ell(t)=\log(1+t)$ , $\ell(t)=(1+t/\beta)^{\alpha}$ with $\beta>0,0<\alpha<1$ .
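The comparison of cut-off points can be made concrete in the normal case, where $\|X\|^{2}\sim\chi^{2}_{d}$ under $\theta=0$ and the inverse moment reduces to a ratio of one-dimensional integrals. The sketch below approximates $a_{0}(\omega)$ with a midpoint rule for $d=6$ and for $\ell(t)=\log(1+t)$ , $\ell(t)=t^{1/2}$ and $\ell(t)=t$ . The integration range, step count, dimension and function names are assumptions made for the illustration.

```python
import math

def inverse_moment(ell_prime, omega, d, upper=120.0, steps=120000):
    # E*[1/T] for T = ||X||^2 with X ~ N_d(0, I_d), reweighted by l'((1 - omega) T).
    # The chi-square normalising constant cancels in the ratio, so it is omitted.
    h = upper / steps
    num = den = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint of the i-th cell
        w = ell_prime((1.0 - omega) * t) * t ** (d / 2.0 - 1.0) * math.exp(-t / 2.0)
        num += w / t
        den += w
    return num / den

def cutoff(ell_prime, omega, d):
    # a_0(omega) = 2 / E*_{0,omega}[1/||X||^2]
    return 2.0 / inverse_moment(ell_prime, omega, d)

def log_prime(t):
    return 1.0 / (1.0 + t)   # l(t) = log(1 + t)

def power_prime(t):
    return 0.5 * t ** -0.5   # l(t) = t^(1/2)

def identity_prime(t):
    return 1.0               # l(t) = t

d = 6
a_log = {w: cutoff(log_prime, w, d) for w in (0.0, 0.3, 0.6, 0.9)}
a_pow = {w: cutoff(power_prime, w, d) for w in (0.0, 0.5)}
a_identity = cutoff(identity_prime, 0.5, d)
print(a_log, a_pow, a_identity)
```

For $\ell(t)=t$ the computed value should be close to $2(d-2)$ , since $E[1/\chi^{2}_{d}]=1/(d-2)$ , which gives a simple sanity check on the integration.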

It is also interesting to assess how the upper cut-off point on the multiple $a$ for the estimator $\delta_{a,r(\cdot)}$ to dominate $X$ varies in terms of the model $f$ and the choice of $\ell$ for the loss function. In the former case, one can infer dominance results that are robust, holding for a given $f$ but also persisting for a class of departures from $f$ . This is quite plausible and simple to visualize as the cut-off point depends on $f$ only through the inverse moment $\mathbb{E}_{0,\omega}^{*}(\frac{1}{\|X\|^{2}})$ . For the latter case, one can infer dominance results that hold simultaneously for a subclass of losses (4.16). Here is such an illustration.

Corollary 4.3.

Consider the context of Theorem 4.4, for a given loss $L_{\omega,\ell}$ , and a Baranchik-type estimator $\delta_{a(1-\omega),r(\cdot)}(X)$ which satisfies the given requirements for dominance of $\delta_{0}(X)=X$ . Then, the dominance persists for the original balanced loss $L_{\omega,\ell_{0}}$ with $\ell_{0}(t)=t$ and $\delta_{0}(X)=X$ .

Proof. It suffices to show that

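$$\mathbb{E}_{0}^{*}\left[\frac{1}{\|X\|^{2}}\right]\,\geq\,\mathbb{E}_{0}\left[\frac{1}{\|X\|^{2}}\right],\qquad(4.19)$$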

where the expectation $\mathbb{E}_{0}^{*}$ is taken with respect to the density $f_{0}^{*}$ given in Theorem 4.4, and where we define the expectation $\mathbb{E}_{0}$ as taken with respect to the density $f(\|x\|^{2})$ . Now observe that the ratio of these densities, which is proportional to $\ell^{\prime}((1-\omega)\|x\|^{2})$ , is decreasing in $\|x\|^{2}$ by the assumptions on $\ell$ . We thus have a monotone likelihood ratio ordering in $\|x\|^{2}$ between the densities, and inequality (4.19) follows since $1/\|x\|^{2}$ is decreasing in $\|x\|^{2}$ . ∎
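The comparison of inverse moments used in this proof can also be reached through a covariance inequality, which is a slightly different route from the monotone likelihood ratio argument given above. The sketch below writes $T=\|X\|^{2}$ and $h(t)=\ell^{\prime}((1-\omega)t)$ , both notations being introduced here for convenience, and assumes that the moments involved are finite.

```latex
% Notation introduced for this sketch: T = \|X\|^2, h(t) = \ell'((1-\omega)t), X \sim f(\|x\|^2).
\begin{align*}
\mathbb{E}_{0}^{*}\!\left[\frac{1}{T}\right]
  &= \frac{\mathbb{E}_{0}\!\left[h(T)/T\right]}{\mathbb{E}_{0}\!\left[h(T)\right]}
  && \text{since } f_{0}^{*} \propto h(\|x\|^{2})\, f(\|x\|^{2}), \\
\mathbb{E}_{0}\!\left[\frac{h(T)}{T}\right] - \mathbb{E}_{0}\!\left[h(T)\right]\mathbb{E}_{0}\!\left[\frac{1}{T}\right]
  &= \operatorname{Cov}_{0}\!\left(h(T), \frac{1}{T}\right) \;\geq\; 0
  && \text{since } h \text{ and } t \mapsto 1/t \text{ are both non-increasing}, \\
\text{hence}\quad
\mathbb{E}_{0}^{*}\!\left[\frac{1}{T}\right] &\geq \mathbb{E}_{0}\!\left[\frac{1}{T}\right]
\quad\text{and}\quad
\frac{2}{\mathbb{E}_{0}^{*}\!\left[1/T\right]} \;\leq\; \frac{2}{\mathbb{E}_{0}\!\left[1/T\right]}.
\end{align*}
```

Read this way, any multiple $a$ that is admissible for the loss $L_{\omega,\ell}$ lies below the cut-off attached to $\ell_{0}(t)=t$ , which is the content of the corollary.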

5 Concluding remarks

For a multivariate normally distributed $X\sim N_{d}(\theta,\sigma^{2}I_{d})$ , and more generally for a scale mixture of normals model $X\sim f(\|x-\theta\|^{2})$ , we have provided shrinkage estimators of $\theta$ that improve on the benchmark estimator $\delta_{0}(X)$ as measured by the frequentist risk associated with balanced loss functions of the types (1.2) and (1.3), and with completely monotone $\rho$ and $\ell$ . Much of the approach is unified with respect to the choices of $f$ and either $\rho$ or $\ell$ , and the findings represent analytical extensions, unavailable up to now, of results for the original balanced loss with identity $\rho$ or $\ell$ .

The findings in this paper do not cover cases with unknown scale, such as observations generated from a $N_{d}(\theta,\sigma^{2}I_{d})$ distribution with unknown $\sigma^{2}$ , as considered in earlier results on the original balanced loss function (e.g., Chung et al., 1999; Zinodiny, 2014), but we expect that the techniques presented here should be useful to derive corresponding results for analogs of loss functions (1.2) and (1.3). Finally, it would be most interesting and welcome to obtain Bayesian estimators that either satisfy our conditions of dominance, or dominate the benchmark $\delta_{0}(X)=X$ under the set-up of Theorems 3.3 and 4.4.

Appendix

Proof of Lemma 3.4, part (a)

With $T/\alpha^{2}=\|Z+\frac{\theta}{\alpha}\|^{2}\sim\chi^{2}_{d}(\,\|\theta\|^{2}/\alpha^{2})$ and the Poisson representation of the non-central $\chi^{2}$ distribution (i.e., $T/\alpha^{2}|K\sim\chi^{2}_{d+2K}\,,\,K\sim\hbox{Poisson}(\lambda=\|\theta\|^{2}/2\alpha^{2})$ ), we have for $d\geq 3$ and $\theta\neq 0$

[TABLE]

with $U(K)\,=\,\frac{K}{d+2K-4}\,\mathbb{I}_{\mathbb{N}_{+}}(K)\,$ . Since $U(k)$ is increasing in $k\in\mathbb{N}$ for $d\geq 3$ , since $\lambda$ is decreasing in $\alpha$ , and since the Poisson( $\lambda$ ) distribution has increasing monotone likelihood ratio in $K$ with parameter $\lambda$ , it follows from the above that $\mathbb{E}_{\alpha}(\frac{1}{T})$ is indeed decreasing in $\alpha$ for $d\geq 3$ .

Acknowledgements

Éric Marchand’s research is supported in part by the Natural Sciences and Engineering Research Council of Canada, and William Strawderman’s research is partially supported by a grant from the Simons Foundation (#418098).

Bibliography

Baran, J. & Stepień-Baran, A. (2013). Sequential estimation of a location parameter and powers of a scale parameter from delayed observations. Statistica Neerlandica, 67, 263–280.

Berger, J.O. (1975). Minimax estimation of location vectors for a wide class of distributions. Annals of Statistics, 3, 1318–1328.

Brandwein, A.C., Ralescu, S. & Strawderman, W.E. (1993). Shrinkage estimation of the location parameters for certain spherically symmetric distributions. Annals of the Institute of Statistical Mathematics, 45, 551–565.

Brandwein, A.C. & Strawderman, W.E. (1991). Generalizations of James-Stein estimators under spherical symmetry. Annals of Statistics, 19, 1639–1650.
