Minimizing the expected value of the asymmetric loss and an inequality of the variance of the loss

Naoya Yamaguchi; Yuka Yamaguchi; and Ryuei Nishii

arXiv:1907.08369·math.ST·March 3, 2023

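TL;DR

This paper corrects an arbitrary prediction by a constant shift $C$, chosen to minimize the expected asymmetric loss under the assumption that the prediction error follows a generalized Gaussian distribution. No regression coefficients are estimated, and the loss of the corrected prediction has a variance no larger than that of the uncorrected one.

Contribution

It characterizes the optimal shift $C$ through an equation in the lower incomplete gamma function and proves that applying the shift does not increase the variance of the loss.

Findings

01 The shift $C$ minimizes the expected asymmetric loss $\operatorname{E}[L(Z+c)]$.

02 The variance of the loss satisfies $\operatorname{V}[L(Z+C)]\leq\operatorname{V}[L(Z)]$, with equality only when $k_{1}=k_{2}$.

03 The shift depends only on the prediction error distribution and the loss function, not on the prediction itself.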

Abstract

For some estimations and predictions, we solve minimization problems with asymmetric loss functions. Usually, we estimate the regression coefficients for these problems. In this paper, we do not make such an estimation, but rather give a solution by correcting any prediction under the assumption that the prediction error follows a generalized Gaussian distribution. In our method, we can not only minimize the expected value of the asymmetric loss, but also lower the variance of the loss.

Tables

Table 1. Case of $0<a<1$

$x$	$0$	$\dots$	$\frac{3 a + 1}{2}$	$\dots$	$p_{4} (a)$	$\dots$	$+ \infty$
$\frac{d}{d x} y_{4} (a, x)$	$0$	$+$	$0$	$-$	$-$	$-$	$0$
$y_{4} (a, x)$	$0$	$+$	$+$	$+$	$0$	$-$	$-$

Table 2. Case of $a\geq 1$

$x$	$0$	$\dots$	$\frac{3 a + 1}{2}$	$\dots$	$+ \infty$
$\frac{d}{d x} y_{4} (a, x)$	$0$	$+$	$0$	$-$	$0$
$y_{4} (a, x)$	$0$	$+$	$+$	$+$	$\begin{matrix} 0 (a = 1) \\ + (a > 1) \end{matrix}$

Table 3. Case of $0<a<1$

$x$	$0$	$\dots$	$p_{2} (a)$	$\dots$	$p_{3} (a)$	$\dots$	$p_{4} (a)$	$\dots$	$+ \infty$
$\frac{d}{d x} y_{3} (a, x)$	$+ \infty$	$+$	$+$	$+$	$+$	$+$	$0$	$-$	$0$
$y_{3} (a, x)$	$-$	$-$	$-$	$-$	$0$	$+$	$+$	$+$	$0$
$\frac{d}{d x} y_{2} (a, x)$	$-$	$-$	$-$	$-$	$0$	$+$	$+$	$+$	$0$
$y_{2} (a, x)$	$+$	$+$	$0$	$-$	$-$	$-$	$-$	$-$	$0$
$\frac{d}{d x} y_{1} (a, x)$	$+$	$+$	$0$	$-$	$-$	$-$	$-$	$-$	$0$
$y_{1} (a, x)$	$0$	$+$	$+$	$+$	$+$	$+$	$+$	$+$	$0$

Table 4. Case of $a\geq 1$

$x$	$0$	$\dots$	$p_{2} (a)$	$\dots$	$p_{3} (a)$	$\dots$	$+ \infty$
$\frac{d}{d x} y_{3} (a, x)$	$0$	$+$	$+$	$+$	$+$	$+$	$\begin{matrix} 0 (a < 2) \\ + (a = 2) \\ + \infty (a > 2) \end{matrix}$
$y_{3} (a, x)$	$-$	$-$	$-$	$-$	$0$	$+$	$0$
$\frac{d}{d x} y_{2} (a, x)$	$-$	$-$	$-$	$-$	$0$	$+$	$0$
$y_{2} (a, x)$	$+$	$+$	$0$	$-$	$-$	$-$	$0$
$\frac{d}{d x} y_{1} (a, x)$	$+$	$+$	$0$	$-$	$-$	$-$	$0$
$y_{1} (a, x)$	$0$	$+$	$+$	$+$	$+$	$+$	$0$

Equations

Linear model and estimator:

$$y=X\beta+\varepsilon,$$

$$\hat{\beta}:=\operatorname*{arg\,min}_{\beta}\left\{\sum_{i=1}^{n}L(r_{i}(\beta))\right\}.$$

Density of the prediction error and loss function:

$$f_{Z}(z):=\frac{1}{2ab\Gamma(a)}\exp\left(-\left\lvert\frac{z}{b}\right\rvert^{\frac{1}{a}}\right),$$

$$L(z):=\begin{cases}k_{1}z,&z\geq 0,\\-k_{2}z,&z<0.\end{cases}$$

Optimal correction:

$$C=\operatorname*{arg\,min}_{c}\left\{\operatorname{E}[L(Z+c)]\right\}.$$

Lemma 1 (Lemma 8):

$$\text{(1)}\quad\operatorname{E}[L(Z+c)]=\frac{(k_{1}-k_{2})c}{2}+\frac{(k_{1}+k_{2})\lvert c\rvert}{2\Gamma(a)}\gamma\left(a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)+\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right);$$

$$\begin{aligned}\text{(2)}\quad\operatorname{V}[L(Z+c)]={}&\frac{(k_{1}+k_{2})^{2}c^{2}}{4}+\frac{(k_{1}^{2}-k_{2}^{2})bc}{2\Gamma(a)}\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)\\&-\frac{(k_{1}+k_{2})^{2}b\lvert c\rvert}{2\Gamma(a)^{2}}\gamma\left(a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)\\&-\frac{(k_{1}+k_{2})^{2}c^{2}}{4\Gamma(a)^{2}}\gamma\left(a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)^{2}-\frac{(k_{1}+k_{2})^{2}b^{2}}{4\Gamma(a)^{2}}\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)^{2}\\&+\frac{(k_{1}^{2}+k_{2}^{2})b^{2}\Gamma(3a)}{2\Gamma(a)}+\operatorname{sgn}(c)\frac{(k_{1}^{2}-k_{2}^{2})b^{2}}{2\Gamma(a)}\gamma\left(3a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right).\end{aligned}$$

Equation for $C$ and minimized expected value:

$$\gamma\left(a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right)=\operatorname{sgn}(C)\frac{k_{2}-k_{1}}{k_{1}+k_{2}}\Gamma(a),$$

$$\operatorname{E}[L(Z+C)]=\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\Gamma\left(2a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right).$$

Corollary 2 (Corollary 11):

$$\operatorname{E}[L(Z)]-\operatorname{E}[L(Z+C)]=\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\gamma\left(2a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right),\qquad\frac{\operatorname{E}[L(Z+C)]}{\operatorname{E}[L(Z)]}=\frac{\Gamma\left(2a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right)}{\Gamma(2a)}.$$

Theorem 3 (Theorem 12):

$$\operatorname{V}[L(Z+C)]\leq\operatorname{V}[L(Z)].$$

Lemma 4 (Lemmas 13 and 17):

$$x^{a}\gamma(a,x)^{2}-x^{a}\Gamma(a)^{2}+2\gamma(a,x)\Gamma(2a,x)>0.$$

Lemma 5 (Lemma 14):

$$2\Gamma(2a)-a\Gamma(a)^{2}>0.$$

Lemma 6 (Lemma 15):

$$4^{a}\Gamma\left(a+\frac{1}{2}\right)>\sqrt{\pi}\,\Gamma(a+1).$$

Gamma and incomplete gamma functions:

$$\Gamma(a):=\int_{0}^{+\infty}t^{a-1}e^{-t}\,dt,\quad\operatorname{Re}(a)>0,$$

$$\Gamma(a,x):=\int_{x}^{+\infty}t^{a-1}e^{-t}\,dt,\quad\gamma(a,x):=\int_{0}^{x}t^{a-1}e^{-t}\,dt.$$

Lemma 7:

(1) $\gamma(a,x)+\Gamma(a,x)=\Gamma(a)$;

(2) $\lim_{x\to\infty}\gamma(a,x)=\Gamma(a)$;

(3) $\Gamma(a,0)=\Gamma(a)$;

(4) $\frac{d}{dx}\gamma(a,x)=x^{a-1}e^{-x}$;

(5) $\frac{d}{dx}\Gamma(a,x)=-x^{a-1}e^{-x}$.

Case $c=0$ of Lemma 8:

$$\operatorname{E}[L(Z)]=\frac{(k_{1}+k_{2})b\,\Gamma(2a)}{2\Gamma(a)},\qquad\operatorname{V}[L(Z)]=\frac{(k_{1}^{2}+k_{2}^{2})b^{2}\Gamma(3a)}{2\Gamma(a)}-\frac{(k_{1}+k_{2})^{2}b^{2}\Gamma(2a)^{2}}{4\Gamma(a)^{2}}.$$

Error function:

$$\operatorname{erf}(x):=\frac{2}{\sqrt{\pi}}\int_{0}^{x}\exp(-t^{2})\,dt.$$

Term of $\operatorname{V}[L(Z+c)]$ in the Laplace case of Example 9 (the rest of the expression is missing from this listing):

$$-\operatorname{sgn}(c)(k_{1}+k_{2})\left\{L(c)+b(k_{1}-k_{2})\right\}b\exp\left(-\frac{\lvert c\rvert}{b}\right)$$

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Statistical Methods and Inference · Bayesian Modeling and Causal Inference

Full text


Key words and phrases:

asymmetric loss function; risk function; minimizing the expectation value; generalized Gauss distribution; gamma function.

2010 Mathematics Subject Classification:

Primary 62A99; Secondary 62E99.

1. Introduction

For some estimations and predictions, we solve minimization problems with loss functions, as follows: Let $\{(x_{i},y_{i})\mid 1\leq i\leq n\}$ be a data set, where $x_{i}$ are $1\times p$ vectors and $y_{i}\in\mathbb{R}$ . We assume that the data relate to a linear model,

$$y=X\beta+\varepsilon,$$

where $y={}^{t}(y_{1},\ldots,y_{n})$, $\varepsilon={}^{t}(\varepsilon_{1},\ldots,\varepsilon_{n})$, and $X$ is the $n\times p$ matrix having $x_{i}$ as the $i$th row. Let $L$ be a loss function and let $r_{i}(\beta):=y_{i}-x_{i}\beta$. Then we estimate the value:

$$\hat{\beta}:=\operatorname*{arg\,min}_{\beta}\left\{\sum_{i=1}^{n}L(r_{i}(\beta))\right\}.$$

The case of $L(r_{i}(\beta))=r_{i}(\beta)^{2}$ is well known (see, e.g., Refs. [1], [8], and [10]). For the case of an asymmetric loss function, we refer the reader to, e.g., Refs. [3], [5], [7], and [14]. These studies estimate the parameter $\hat{\beta}$. In this paper, however, we do not make such an estimation, but instead give a solution to the minimization problem by correcting any prediction under the assumption that the prediction error follows a generalized Gaussian distribution. In our method, we can not only minimize the expected value of the asymmetric loss, but also lower the variance of the loss.

Let $y$ be an observed value, and let $\hat{y}$ be a predicted value of $y$. We derive the optimized predicted value $y^{*}=\hat{y}+C$ minimizing the expected value of the loss under the following assumptions:

(1) The prediction error $z:=\hat{y}-y$ is the realized value of a random variable $Z$ whose density function is that of a generalized Gaussian distribution (see, e.g., Refs. [4], [9], and [11]) with mean zero,

$$f_{Z}(z):=\frac{1}{2ab\Gamma(a)}\exp\left(-\left\lvert\frac{z}{b}\right\rvert^{\frac{1}{a}}\right),$$

where $\Gamma(a)$ is the gamma function and $a$, $b\in\mathbb{R}_{>0}$.

(2) Let $k_{1}$, $k_{2}\in\mathbb{R}_{>0}$. If there is a mismatch between $y$ and $\hat{y}$, then we suffer a loss,

$$L(z):=\begin{cases}k_{1}z,&z\geq 0,\\-k_{2}z,&z<0.\end{cases}$$

That is, the solution to the minimization problem is

$$C=\operatorname*{arg\,min}_{c}\left\{\operatorname{E}[L(Z+c)]\right\}.$$
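As a minimal sketch of the two assumptions, the density $f_{Z}(z)=\frac{1}{2ab\Gamma(a)}\exp(-\lvert z/b\rvert^{1/a})$ and the loss $L$ can be written directly in Python. The helper names `gg_density`, `asymmetric_loss`, and `midpoint_integral` are illustrative and do not come from the paper; the numerical integration is only a sanity check that the density integrates to one for $a=1$ (Laplace) and $a=\frac{1}{2}$ (Gauss).

```python
import math

def gg_density(z, a, b):
    """Generalized Gaussian density exp(-|z/b|^(1/a)) / (2 a b Gamma(a))."""
    return math.exp(-abs(z / b) ** (1.0 / a)) / (2.0 * a * b * math.gamma(a))

def asymmetric_loss(z, k1, k2):
    """L(z) = k1 * z for z >= 0 and -k2 * z for z < 0."""
    return k1 * z if z >= 0 else -k2 * z

def midpoint_integral(f, lo, hi, n=100_000):
    """Composite midpoint rule (illustrative helper, not from the paper)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# a = 1 gives Laplace(0, b); a = 1/2 gives N(0, b^2 / 2).
total_laplace = midpoint_integral(lambda z: gg_density(z, 1.0, 1.0), -40.0, 40.0)
total_gauss = midpoint_integral(lambda z: gg_density(z, 0.5, 1.0), -10.0, 10.0)
print(total_laplace, total_gauss)
```

Both totals should be close to one, up to the discretization error of the midpoint rule.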

The motivation of our research is as follows: (1) Predictions usually come with prediction errors, so predictions must be used with those errors taken into account. In fact, in some cases it is best not to act exactly as predicted. For example, the paper [13] formulates a method for minimizing the expected value of the procurement cost of electricity in two popular spot markets, day-ahead and intra-day, under the assumption that the expected values of the unit prices and the distributions of the prediction errors for the electricity demand traded in the two markets are known. That paper showed that, in some cases, increasing or decreasing the procurement relative to the prediction reduces the expected procurement cost. (2) In recent years, prediction methods have become black boxes owing to big data and machine learning (see, e.g., Ref. [6]). The day will soon come when we must minimize an objective function by using predictions obtained by such black-box methods. In our method, even if we do not know the prediction $\hat{y}$ itself, we can determine the parameter $C$ as long as we know the prediction error distribution $f_{Z}$ and the asymmetric loss function $L$.

To obtain $y^{*}$ , we derive $\operatorname{{E}}[L(Z+c)]$ for any $c\in\mathbb{R}$ . Let $\Gamma(a,x)$ and $\gamma(a,x)$ be the upper and the lower incomplete gamma functions, respectively (see, e.g., Ref. [12]). The expected value and the variance of $L(Z+c)$ are as follows:

Lemma 1.

For any $c\in\mathbb{R}$ , we have

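$$\text{(1)}\quad\operatorname{E}[L(Z+c)]=\frac{(k_{1}-k_{2})c}{2}+\frac{(k_{1}+k_{2})\lvert c\rvert}{2\Gamma(a)}\gamma\left(a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)+\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right);$$

$$\begin{aligned}\text{(2)}\quad\operatorname{V}[L(Z+c)]={}&\frac{(k_{1}+k_{2})^{2}c^{2}}{4}+\frac{(k_{1}^{2}-k_{2}^{2})bc}{2\Gamma(a)}\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)\\&-\frac{(k_{1}+k_{2})^{2}b\lvert c\rvert}{2\Gamma(a)^{2}}\gamma\left(a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)\\&-\frac{(k_{1}+k_{2})^{2}c^{2}}{4\Gamma(a)^{2}}\gamma\left(a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)^{2}-\frac{(k_{1}+k_{2})^{2}b^{2}}{4\Gamma(a)^{2}}\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)^{2}\\&+\frac{(k_{1}^{2}+k_{2}^{2})b^{2}\Gamma(3a)}{2\Gamma(a)}+\operatorname{sgn}(c)\frac{(k_{1}^{2}-k_{2}^{2})b^{2}}{2\Gamma(a)}\gamma\left(3a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right).\end{aligned}$$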

We write the value of $c$ satisfying $\frac{d}{dc}\operatorname{{E}}[L(Z+c)]=0$ as $C$ . Then, we find that $\operatorname{{E}}[L(Z+c)]$ has a minimum value at $c=C$ . Also, it follows from

$$\gamma\left(a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right)=\operatorname{sgn}(C)\frac{k_{2}-k_{1}}{k_{1}+k_{2}}\Gamma(a)\tag{3}$$

that $\operatorname{sgn}(C)=\operatorname{sgn}(k_{2}-k_{1})$, where $\operatorname{sgn}(c):=1\>(c\geq 0);-1\>(c<0)$, and $C=0$ only when $k_{1}=k_{2}$. This equation implies that the ratio of $\Gamma(a)$ to $\gamma\left(a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right)$ is $1:\frac{\lvert k_{2}-k_{1}\rvert}{k_{1}+k_{2}}$. That is, the vertical line $t=\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}$ divides the area between the curve $t^{a-1}e^{-t}$ and the $t$-axis in the ratio $\frac{\lvert k_{2}-k_{1}\rvert}{k_{1}+k_{2}}:1-\frac{\lvert k_{2}-k_{1}\rvert}{k_{1}+k_{2}}$.
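The condition $\gamma\left(a,\lvert C/b\rvert^{1/a}\right)=\operatorname{sgn}(C)\frac{k_{2}-k_{1}}{k_{1}+k_{2}}\Gamma(a)$ determines $C$ only implicitly. The following is a minimal sketch of solving it numerically, assuming a power-series evaluation of $\gamma(a,x)$ and plain bisection. Both are choices made here for illustration and are not taken from the paper, and `lower_gamma` and `optimal_shift` are hypothetical names.

```python
import math

def lower_gamma(a, x):
    """Lower incomplete gamma via x^a e^{-x} * sum_n x^n / (a (a+1) ... (a+n))."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total) and n < 10_000:
        term *= x / (a + n)
        total += term
        n += 1
    return x ** a * math.exp(-x) * total

def optimal_shift(a, b, k1, k2, tol=1e-12):
    """Solve gamma(a, t) = |k2 - k1| / (k1 + k2) * Gamma(a) for t, then C = sgn(k2 - k1) * b * t^a."""
    if k1 == k2:
        return 0.0
    target = abs(k2 - k1) / (k1 + k2) * math.gamma(a)
    lo, hi = 0.0, 1.0
    while lower_gamma(a, hi) < target:  # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:                # bisection: gamma(a, t) is increasing in t
        mid = 0.5 * (lo + hi)
        if lower_gamma(a, mid) < target:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return math.copysign(b * t ** a, k2 - k1)

print(optimal_shift(1.0, 1.0, 50.0, 150.0))  # Laplace errors
print(optimal_shift(0.5, 1.0, 50.0, 150.0))  # Gauss errors
```

For Laplace errors with $k_{1}=50$, $k_{2}=150$, and $b=1$, the condition reduces to $1-e^{-C}=\frac{1}{2}$, so the first call should return a value close to $\log 2$.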

Substituting $c=C$ in the equation $(1)$ of Lemma 1, from the equation $(3)$ , we have

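$$\operatorname{E}[L(Z+C)]=\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\Gamma\left(2a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right).$$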

This is the minimum value of $\operatorname{{E}}[L(Z+c)]$ . From this and the $c=0$ case of the equation $(1)$ of Lemma 1, we have the following corollary:

Corollary 2.

We have

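$$\operatorname{E}[L(Z)]-\operatorname{E}[L(Z+C)]=\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\gamma\left(2a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right),\qquad\frac{\operatorname{E}[L(Z+C)]}{\operatorname{E}[L(Z)]}=\frac{\Gamma\left(2a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right)}{\Gamma(2a)}.$$

This corollary asserts that the expected value of the loss is reduced by correcting a predicted value $\hat{y}$ to the optimized predicted value $y^{*}$. Moreover, the following holds: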

Theorem 3.

We have

$$\operatorname{V}[L(Z+C)]\leq\operatorname{V}[L(Z)],$$

where equality holds only when $C=0$; that is, when $k_{1}=k_{2}$.

This theorem asserts that the variance of the loss is not increased, and is strictly reduced when $k_{1}\neq k_{2}$, by correcting the predicted value $\hat{y}$ to the optimized predicted value $y^{*}$. To prove this theorem, we use the following lemma:

Lemma 4.

For $a>0$ and $x>0$ , we have

$$x^{a}\gamma(a,x)^{2}-x^{a}\Gamma(a)^{2}+2\gamma(a,x)\Gamma(2a,x)>0.$$

To prove Lemma 4, we use the following lemmas:

Lemma 5.

For $a>0$ , we have

$$2\Gamma(2a)-a\Gamma(a)^{2}>0.$$

Lemma 6.

For $a>0$ , we have

$$4^{a}\Gamma\left(a+\frac{1}{2}\right)>\sqrt{\pi}\,\Gamma(a+1).$$
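The two gamma-function inequalities, $2\Gamma(2a)-a\Gamma(a)^{2}>0$ and $4^{a}\Gamma(a+\frac{1}{2})>\sqrt{\pi}\,\Gamma(a+1)$, can be spot-checked on a grid. This is a numerical sanity check only, not a proof, and the function names and the grid are choices made here for illustration.

```python
import math

def lemma5_gap(a):
    """2 * Gamma(2a) - a * Gamma(a)^2, claimed positive for a > 0."""
    return 2.0 * math.gamma(2.0 * a) - a * math.gamma(a) ** 2

def lemma6_ratio(a):
    """g(a) = 4^a Gamma(a + 1/2) / (sqrt(pi) Gamma(a + 1)), claimed > 1 for a > 0."""
    return 4.0 ** a * math.gamma(a + 0.5) / (math.sqrt(math.pi) * math.gamma(a + 1.0))

grid = [0.05 * i for i in range(1, 201)]  # a in (0, 10]
print(min(lemma5_gap(a) for a in grid) > 0.0)
print(min(lemma6_ratio(a) for a in grid) > 1.0)
```

At positive integers $a=n$ the ratio `lemma6_ratio(n)` equals the central binomial coefficient $\binom{2n}{n}$, which gives an independent check of the implementation.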

The remainder of this paper is organized as follows. In Section $2$, we set up the problem. In Section $3$, we introduce the expected value and the variance of $L(Z+c)$, and we determine the value of $c=C$ that gives the minimum value of $\operatorname{E}[L(Z+c)]$. In addition, we give a geometrical interpretation of the parameter $C$, and give the minimized expected value $\operatorname{E}[L(Z+C)]$. In Section $4$, we prove Theorem 3. In Section $5$, we give some inequalities for the gamma and the incomplete gamma functions, which are used to derive the inequality for the variance of the loss in Theorem 3. In Section $6$, we carry out the calculation of the expected value and the variance of the loss $L(Z+c)$ for $c\in\mathbb{R}$.

2. Problem statement

In this section, we set up the problem. Let $y$ be an observed value, let $\hat{y}$ be a predicted value of $y$, and let $\Gamma(a)$ be the gamma function (see, e.g., Ref. [12, p. 93]) defined by

$$\Gamma(a):=\int_{0}^{+\infty}t^{a-1}e^{-t}\,dt,\quad\operatorname{Re}(a)>0.$$

We assume the following:

(1) The prediction error $z:=\hat{y}-y$ is the realized value of a random variable $Z$ whose density function is that of a generalized Gaussian distribution (see, e.g., Refs. [4], [9], and [11]) with mean zero,

$$f_{Z}(z):=\frac{1}{2ab\Gamma(a)}\exp\left(-\left\lvert\frac{z}{b}\right\rvert^{\frac{1}{a}}\right),$$

where $a$, $b\in\mathbb{R}_{>0}$.

(2) Let $k_{1}$, $k_{2}\in\mathbb{R}_{>0}$. If there is a mismatch between $y$ and $\hat{y}$, then we suffer a loss,

$$L(z):=\begin{cases}k_{1}z,&z\geq 0,\\-k_{2}z,&z<0.\end{cases}$$

We derive the optimized predicted value $y^{*}=\hat{y}+C$ minimizing $\operatorname{{E}}[L(Z+c)]$ . For this purpose, we derive $\operatorname{{E}}[L(Z+c)]$ for any $c\in\mathbb{R}$ in the next section.

3. Expected value and variance of the loss

Here, we introduce the expected value and the variance of $L(Z+c)$ , and determine the value of $c=C$ that gives the minimum value of $\operatorname{{E}}[L(Z+c)]$ . In addition, we give a geometrical interpretation of the parameter $C$ and give the minimized expected value $\operatorname{{E}}[L(Z+C)]$ .

3.1. Expected value and variance of the loss

Let $\Gamma(a,x)$ and $\gamma(a,x)$ be the upper and the lower incomplete gamma functions, respectively, defined by

$$\Gamma(a,x):=\int_{x}^{+\infty}t^{a-1}e^{-t}\,dt,\quad\gamma(a,x):=\int_{0}^{x}t^{a-1}e^{-t}\,dt,$$

where $\text{Re}(a)>0$ and $x\geq 0$ . These functions have the following properties:

Lemma 7.

For ${\rm Re}(a)>0$ and $x\geq 0$ ,

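(1) $\gamma(a,x)+\Gamma(a,x)=\Gamma(a)$;

(2) $\lim_{x\to\infty}\gamma(a,x)=\Gamma(a)$;

(3) $\Gamma(a,0)=\Gamma(a)$;

(4) $\frac{d}{dx}\gamma(a,x)=x^{a-1}e^{-x}$;

(5) $\frac{d}{dx}\Gamma(a,x)=-x^{a-1}e^{-x}$.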

Also, for $c\in\mathbb{R}$ , let $\operatorname{{sgn}}(c):=1\>(c\geq 0);-1\>(c<0)$ . Then, the expected value and the variance of $L(Z+c)$ are as follows:

Lemma 8 (Section 1, Lemma 1).

For any $c\in\mathbb{R}$ , we have

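$$\text{(1)}\quad\operatorname{E}[L(Z+c)]=\frac{(k_{1}-k_{2})c}{2}+\frac{(k_{1}+k_{2})\lvert c\rvert}{2\Gamma(a)}\gamma\left(a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)+\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right);$$

$$\begin{aligned}\text{(2)}\quad\operatorname{V}[L(Z+c)]={}&\frac{(k_{1}+k_{2})^{2}c^{2}}{4}+\frac{(k_{1}^{2}-k_{2}^{2})bc}{2\Gamma(a)}\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)\\&-\frac{(k_{1}+k_{2})^{2}b\lvert c\rvert}{2\Gamma(a)^{2}}\gamma\left(a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)\\&-\frac{(k_{1}+k_{2})^{2}c^{2}}{4\Gamma(a)^{2}}\gamma\left(a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)^{2}-\frac{(k_{1}+k_{2})^{2}b^{2}}{4\Gamma(a)^{2}}\Gamma\left(2a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right)^{2}\\&+\frac{(k_{1}^{2}+k_{2}^{2})b^{2}\Gamma(3a)}{2\Gamma(a)}+\operatorname{sgn}(c)\frac{(k_{1}^{2}-k_{2}^{2})b^{2}}{2\Gamma(a)}\gamma\left(3a,\left\lvert\frac{c}{b}\right\rvert^{\frac{1}{a}}\right).\end{aligned}$$

See Section $6$ for the proof of Lemma 8.

From Lemma 8 with $c=0$, we have the following:

$$\operatorname{E}[L(Z)]=\frac{(k_{1}+k_{2})b\,\Gamma(2a)}{2\Gamma(a)},\qquad\operatorname{V}[L(Z)]=\frac{(k_{1}^{2}+k_{2}^{2})b^{2}\Gamma(3a)}{2\Gamma(a)}-\frac{(k_{1}+k_{2})^{2}b^{2}\Gamma(2a)^{2}}{4\Gamma(a)^{2}}.$$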
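As a hedged cross-check of the uncorrected case $c=0$, the sketch below assumes the closed forms $\operatorname{E}[L(Z)]=\frac{(k_{1}+k_{2})b\,\Gamma(2a)}{2\Gamma(a)}$ and $\operatorname{V}[L(Z)]=\frac{(k_{1}^{2}+k_{2}^{2})b^{2}\Gamma(3a)}{2\Gamma(a)}-\operatorname{E}[L(Z)]^{2}$, which follow from the moments $\operatorname{E}[\lvert Z\rvert]=b\Gamma(2a)/\Gamma(a)$ and $\operatorname{E}[Z^{2}]=b^{2}\Gamma(3a)/\Gamma(a)$ of the assumed density, and compares them with a midpoint-rule integration. All function names are illustrative.

```python
import math

def mean_loss_uncorrected(a, b, k1, k2):
    """E[L(Z)] = (k1 + k2) * b * Gamma(2a) / (2 * Gamma(a))."""
    return (k1 + k2) * b * math.gamma(2.0 * a) / (2.0 * math.gamma(a))

def var_loss_uncorrected(a, b, k1, k2):
    """V[L(Z)] = E[L(Z)^2] - E[L(Z)]^2 with E[L(Z)^2] = (k1^2 + k2^2) b^2 Gamma(3a) / (2 Gamma(a))."""
    second_moment = (k1 ** 2 + k2 ** 2) * b ** 2 * math.gamma(3.0 * a) / (2.0 * math.gamma(a))
    return second_moment - mean_loss_uncorrected(a, b, k1, k2) ** 2

def numeric_moments(a, b, k1, k2, half_width, n=100_000):
    """Mean and variance of L(Z) by the midpoint rule on [-half_width, half_width]."""
    norm = 2.0 * a * b * math.gamma(a)
    h = 2.0 * half_width / n
    m1 = m2 = 0.0
    for i in range(n):
        z = -half_width + (i + 0.5) * h
        weight = math.exp(-abs(z / b) ** (1.0 / a)) / norm * h
        loss = k1 * z if z >= 0.0 else -k2 * z
        m1 += loss * weight
        m2 += loss * loss * weight
    return m1, m2 - m1 * m1

for name, a, half_width in (("Laplace", 1.0, 60.0), ("Gauss", 0.5, 10.0)):
    closed = (mean_loss_uncorrected(a, 1.0, 50.0, 150.0), var_loss_uncorrected(a, 1.0, 50.0, 150.0))
    print(name, closed, numeric_moments(a, 1.0, 50.0, 150.0, half_width))
```

The values $k_{1}=50$ and $b=1$ match the conditions used for the plots in the text; $k_{2}=150$ is an arbitrary choice for the demonstration.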

Let $\operatorname{{erf}}(x)$ be the error function defined by

$$\operatorname{erf}(x):=\frac{2}{\sqrt{\pi}}\int_{0}^{x}\exp(-t^{2})\,dt$$

for any $x\in\mathbb{R}$ . We give two examples of $\operatorname{{E}}[L(Z+c)]$ and $\operatorname{{V}}[L(Z+c)]$ .

Example 9.

In the case of ${\rm Laplace}(0,b)$ , since $a=1$ , we have

[TABLE]

In the case of $\mathcal{N}(0,\frac{1}{2}b^{2})$ , since $a=\frac{1}{2}$ , we have

[TABLE]

With the conditions fixed as $k_{1}=50$ and $b=1$ , we can plot $\operatorname{{E}}[L(Z)]$ and $\operatorname{{V}}[L(Z)]$ for the Laplace and the Gauss distributions as follows:

3.2. Parameter value minimizing the expected value

Here, we determine the value of $c=C$ that gives the minimum value of $\operatorname{{E}}[L(Z+c)]$ . Since

[TABLE]

we have

[TABLE]

We will denote the value of $c$ satisfying $\frac{d}{dc}\operatorname{{E}}[L(Z+c)]=0$ as $C$ . Then, from the first derivative test, we find that $\operatorname{{E}}[L(Z+c)]$ has a minimum value at $c=C$ .

[TABLE]

Also, it follows from

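$$\gamma\left(a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right)=\operatorname{sgn}(C)\frac{k_{2}-k_{1}}{k_{1}+k_{2}}\Gamma(a)$$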

that $\operatorname{{sgn}}(C)=\operatorname{{sgn}}(k_{2}-k_{1})$ and $C=0$ only when $k_{1}=k_{2}$ .

Moreover, the defining equation of $C$ implies that the ratio of $\Gamma(a)$ to $\gamma\left(a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right)$ is $1:\frac{\lvert k_{2}-k_{1}\rvert}{k_{1}+k_{2}}$. That is, the vertical line $t=\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}$ divides the area between the curve $t^{a-1}e^{-t}$ and the $t$-axis in the ratio $\frac{\lvert k_{2}-k_{1}\rvert}{k_{1}+k_{2}}:1-\frac{\lvert k_{2}-k_{1}\rvert}{k_{1}+k_{2}}$.

Let $\operatorname{{erf}}^{-1}(x)$ be the inverse error function. We give two examples of $C$ .

Example 10.

In the case of ${\rm Laplace}(0,b)$ , since $a=1$ , we have

[TABLE]

In the case of $\mathcal{N}(0,\frac{1}{2}b^{2})$ , since $a=\frac{1}{2}$ , we have

[TABLE]
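As a sketch, specializing the condition $\gamma\left(a,\lvert C/b\rvert^{1/a}\right)=\operatorname{sgn}(C)\frac{k_{2}-k_{1}}{k_{1}+k_{2}}\Gamma(a)$ to $a=1$ and $a=\frac{1}{2}$ gives $C=\operatorname{sgn}(k_{2}-k_{1})\,b\log\frac{k_{1}+k_{2}}{2\min(k_{1},k_{2})}$ for the Laplace case and $\operatorname{erf}(C/b)=\frac{k_{2}-k_{1}}{k_{1}+k_{2}}$ for the Gauss case. These specializations are worked out here and are not copied from the paper's displayed formulas. In both cases $C$ is the $\frac{k_{2}}{k_{1}+k_{2}}$ quantile of $Z$, which the Gauss branch below uses through the standard library's `NormalDist`. The function names are illustrative.

```python
import math
from statistics import NormalDist

def shift_laplace(b, k1, k2):
    """C for Laplace(0, b): sgn(k2 - k1) * b * log((k1 + k2) / (2 * min(k1, k2)))."""
    return math.copysign(b * math.log((k1 + k2) / (2.0 * min(k1, k2))), k2 - k1)

def shift_gauss(b, k1, k2):
    """C for N(0, b^2 / 2): the k2 / (k1 + k2) quantile of Z."""
    return NormalDist(0.0, b / math.sqrt(2.0)).inv_cdf(k2 / (k1 + k2))

k1, b = 50.0, 1.0
for k2 in (25.0, 50.0, 150.0):
    print(k2, shift_laplace(b, k1, k2), shift_gauss(b, k1, k2))
```

The shift is positive when $k_{2}>k_{1}$, negative when $k_{2}<k_{1}$, and zero when the loss is symmetric.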

Fixing the conditions as $k_{1}=50$ and $b=1$ , we can plot $C$ for the Laplace and the Gauss distributions as follows:

3.3. Minimized expected value of the loss

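We give the minimum value of $\operatorname{E}[L(Z+c)]$. Substituting $c=C$ in equation $(1)$ of Lemma 8 and using the defining equation of $C$, we have

$$\operatorname{E}[L(Z+C)]=\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\Gamma\left(2a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right).$$

This is the minimum value of $\operatorname{E}[L(Z+c)]$. From this and the expression for $\operatorname{E}[L(Z)]$ in Section $3.1$, we have the following corollary: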

Corollary 11 (Section 1, Corollary 2).

We have

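$$\operatorname{E}[L(Z)]-\operatorname{E}[L(Z+C)]=\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\gamma\left(2a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right),\qquad\frac{\operatorname{E}[L(Z+C)]}{\operatorname{E}[L(Z)]}=\frac{\Gamma\left(2a,\left\lvert\frac{C}{b}\right\rvert^{\frac{1}{a}}\right)}{\Gamma(2a)}.$$

Fixing the conditions as $k_{1}=50$ and $b=1$, we can plot $\operatorname{E}[L(Z)]-\operatorname{E}[L(Z+C)]$ for the Laplace and the Gauss distributions as follows: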

4. An inequality for the variance of the loss

In this section, we derive an inequality for the variance of $L(Z+c)$ . Let $C$ be the value of $c$ giving the minimum value of $\operatorname{{E}}[L(Z+c)]$ . Then, the following holds:

Theorem 12 (Section 1, Theorem 3).

We have

$$\operatorname{V}[L(Z+C)]\leq\operatorname{V}[L(Z)],$$

where equality holds only when $C=0$ ; that is, when $k_{1}=k_{2}$ .
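The inequality $\operatorname{V}[L(Z+C)]\leq\operatorname{V}[L(Z)]$ can be illustrated numerically. The sketch below is a single-case check, not a proof: it assumes Laplace errors ($a=1$) with $k_{1}=50$, $k_{2}=150$, $b=1$, uses the Laplace value $C=b\log\frac{k_{1}+k_{2}}{2k_{1}}$ (valid for $k_{2}>k_{1}$), and integrates with the midpoint rule. The function name `loss_moments` is illustrative.

```python
import math

def loss_moments(c, a, b, k1, k2, half_width=60.0, n=120_000):
    """Mean and variance of L(Z + c) by the midpoint rule (illustrative check)."""
    norm = 2.0 * a * b * math.gamma(a)
    h = 2.0 * half_width / n
    m1 = m2 = 0.0
    for i in range(n):
        z = -half_width + (i + 0.5) * h
        weight = math.exp(-abs(z / b) ** (1.0 / a)) / norm * h
        shifted = z + c
        loss = k1 * shifted if shifted >= 0.0 else -k2 * shifted
        m1 += loss * weight
        m2 += loss * loss * weight
    return m1, m2 - m1 * m1

k1, k2, b = 50.0, 150.0, 1.0
C = b * math.log((k1 + k2) / (2.0 * k1))  # Laplace case with k2 > k1
mean0, var0 = loss_moments(0.0, 1.0, b, k1, k2)
meanC, varC = loss_moments(C, 1.0, b, k1, k2)
print("uncorrected:", mean0, var0)
print("corrected:  ", meanC, varC)
```

In this case both the mean and the variance of the loss should come out smaller for the corrected prediction than for the uncorrected one.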

Fixing the conditions as $k_{1}=50$ and $b=1$ , we can plot $\operatorname{{V}}[L(Z)]-\operatorname{{V}}[L(Z+C)]$ for the Laplace and the Gauss distributions as follows:

To prove Theorem 12, we use the following lemma:

Lemma 13 (Section 1, Lemma 4).

For $a>0$ and $x>0$ , we have

$$x^{a}\gamma(a,x)^{2}-x^{a}\Gamma(a)^{2}+2\gamma(a,x)\Gamma(2a,x)>0.$$

The proof of Lemma 13 is presented in Section $5.2$ . Now we can prove Theorem 12.

Proof of Theorem 12.

It follows from the defining equation of $C$ that

[TABLE]

Hence, substituting $c=C$ in equation $(2)$ of Lemma 8, we have

[TABLE]

From this and the expression for $\operatorname{V}[L(Z)]$, we obtain

[TABLE]

where, for $a>0$ and $x\geq 0$ , $f(a,x)$ is defined as

[TABLE]

Here, since

[TABLE]

from Lemma 13, we have $\frac{d}{dx}f(a,x)>0$ ( $a>0$ , $x>0$ ). Also, $f(a,0)=0$ holds for $a>0$ . Therefore, we obtain

[TABLE]

where equality holds only when $C=0$. Moreover, from the defining equation of $C$, we find that $C=0$ holds only when $k_{1}=k_{2}$. ∎

5. Inequalities for the gamma and the incomplete gamma functions

In this section, we give some inequalities for the gamma and the incomplete gamma functions, which we used to derive the inequality for the variance of the loss in Theorem 12.

5.1. Inequalities for the gamma function

To prove Lemma 13, we use the following:

Lemma 14 (Section 1, Lemma 5).

For $a>0$ , we have

$$2\Gamma(2a)-a\Gamma(a)^{2}>0.$$

Next, to prove Lemma 14, we use the following:

Lemma 15 (Section 1, Lemma 6).

For $a>0$ , we have

$$4^{a}\Gamma\left(a+\frac{1}{2}\right)>\sqrt{\pi}\,\Gamma(a+1).$$

Furthermore, to prove Lemma 15, we need another lemma:

Lemma 16.

We have

[TABLE]

Proof.

Let $S_{n}:=\sum_{k=1}^{n}\frac{1}{k(2k-1)}$ . Accordingly, we have

[TABLE]

Therefore, we find

[TABLE]

The lemma is thus proved. ∎
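The displayed statement of Lemma 16 is not reproduced in this copy. Assuming it concerns the limit of the partial sums $S_{n}=\sum_{k=1}^{n}\frac{1}{k(2k-1)}$ defined in the proof, the sketch below checks two standard facts: the identity $S_{n}=2(H_{2n}-H_{n})$, where $H_{n}$ is the $n$th harmonic number, and the convergence of $S_{n}$ toward $2\log 2$. The function names are illustrative.

```python
import math
from fractions import Fraction

def partial_sum(n):
    """S_n = sum_{k=1}^{n} 1 / (k (2k - 1)), as an exact fraction."""
    return sum(Fraction(1, k * (2 * k - 1)) for k in range(1, n + 1))

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Exact check of S_n = 2 (H_{2n} - H_n), which follows from 1/(k(2k-1)) = 2/(2k-1) - 2/(2k).
identity_holds = all(partial_sum(n) == 2 * (harmonic(2 * n) - harmonic(n)) for n in range(1, 30))

# Floating-point partial sum for a large n, compared with 2 log 2.
s_float = sum(1.0 / (k * (2 * k - 1)) for k in range(1, 100_001))
print(identity_holds, s_float, 2.0 * math.log(2.0))
```

The partial sums increase toward the limit from below, with a remainder of roughly $\frac{1}{2n}$.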

Now we can prove Lemma 15.

Proof of Lemma 15.

Let

[TABLE]

To prove $g(a)>1$ for $a>0$ , we use the following formula [2, p.13, Theorem 1.2.5]:

[TABLE]

where $\gamma_{0}$ is Euler’s constant given by

[TABLE]

Taking the logarithmic derivative of $g(a)$ , from the above formula, we have

[TABLE]

for $a>0$ . Moreover, using Lemma 16, we obtain $\frac{d}{da}\log{g(a)}>0$ for $a>0$ . This leads to $\frac{d}{da}g(a)>0$ for $a>0$ . The lemma follows from this and $g(0)=1$ . ∎

Now, we can prove Lemma 14.

Proof of Lemma 14.

We use the following formula [2, p.22, Theorem 6.5.1]:

[TABLE]

From this and Lemma 15, we have

[TABLE]

The lemma is thus proved. ∎
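A worked version of this step, assuming the formula cited from [2] is Legendre's duplication formula $\Gamma(2a)=\frac{2^{2a-1}}{\sqrt{\pi}}\Gamma(a)\Gamma\left(a+\frac{1}{2}\right)$ (the displayed formula is not reproduced in this copy), together with the inequality $4^{a}\Gamma\left(a+\frac{1}{2}\right)>\sqrt{\pi}\,\Gamma(a+1)$ of Lemma 15 and $\Gamma(a)>0$ for $a>0$:

```latex
\begin{align*}
2\Gamma(2a)
  &= \frac{4^{a}}{\sqrt{\pi}}\,\Gamma(a)\,\Gamma\!\left(a+\tfrac{1}{2}\right)
     && \text{(duplication formula)} \\
  &> \Gamma(a)\,\Gamma(a+1)
     && \text{(Lemma 15, multiplied by } \Gamma(a)/\sqrt{\pi} > 0\text{)} \\
  &= a\,\Gamma(a)^{2}
     && \text{(since } \Gamma(a+1) = a\,\Gamma(a)\text{)}.
\end{align*}
```

Rearranging gives $2\Gamma(2a)-a\Gamma(a)^{2}>0$ for $a>0$, which is the statement of Lemma 14.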

5.2. Inequalities for the incomplete gamma functions

We will prove the following lemma:

Lemma 17 (Section $4$ , Lemma 13).

For $a>0$ and $x>0$ , we have

$$x^{a}\gamma(a,x)^{2}-x^{a}\Gamma(a)^{2}+2\gamma(a,x)\Gamma(2a,x)>0.$$

To prove Lemma 17, we need to prove two other lemmas:

Lemma 18.

For $a>0$ and $x\geq 0$ , we have

[TABLE]

Proof.

For $a>0$ and $x\geq 0$ , we define

[TABLE]

Then, we have

[TABLE]

The lemma follows from this and $u(a,0)=0$ . ∎

Lemma 19.

For $a>0$ and $b\in\mathbb{R}$ , we have

[TABLE]

Proof.

When $b\leq 0$, it is easily obtained from the definition of $\Gamma(a,x)$. When $b>0$, using L’Hôpital’s rule, we obtain

[TABLE]

∎

Now, we can prove Lemma 17.

Proof of Lemma 17.

For $a>0$ and $x\geq 0$ , we define

[TABLE]

Let us prove $y_{1}(a,x)>0$ ( $a>0$ , $x>0$ ). For $a>0$ and $x\geq 0$ , we define

[TABLE]

Then, we have

[TABLE]

From these relations, we find that the (positive or negative) signs of $\frac{d}{dx}y_{i}(a,x)$ and $y_{i+1}(a,x)$ ( $i=1,2,3$ ) are equal to each other for $a>0$ and $x>0$ . Let $p_{i}(a)$ ( $i=2,3,4$ ) be the value of $x$ satisfying $y_{i}(a,x)=0$ . It is easily verified that $\lim_{x\to 0+}\frac{d}{dx}y_{4}(a,x)=\lim_{x\to+\infty}\frac{d}{dx}y_{4}(a,x)=\lim_{x\to 0+}y_{4}(a,x)=0$ and $\lim_{x\to+\infty}y_{4}(a,x)=a(a-1)\Gamma(a)$ for $a>0$ . Therefore, from the first derivative test, we obtain Tables $1$ and $2$ . Moreover, using Lemmas 18, 19, and L’Hôpital’s rule, we obtain

[TABLE]

From these results, Lemma 14, and the fact that the signs of $\frac{d}{dx}y_{i}(a,x)$ and $y_{i+1}(a,x)$ ( $i=1,2,3$ ) are equal to each other for $a>0$ and $x>0$ , we obtain Tables $3$ and $4$ . From Tables $3$ and $4$ , we can verify that $y_{1}(a,x)>0$ holds for $a>0$ and $x>0$ . This completes the proof of the lemma. ∎

6. Calculation of the expected value and the variance of the loss

Here, we calculate the expected value and the variance of the loss $L(Z+c)$ for $c\in\mathbb{R}$ .

6.1. Expected value of the loss

Here, let us put $\beta:=(2ab\Gamma(a))^{-1}$ ; then, we have

[TABLE]

Replace $z$ with $bz$ to get

[TABLE]

When $c\geq 0$ , we have

[TABLE]

When $c<0$ , we have

[TABLE]

From the above, for any $c\in\mathbb{R}$ , we have

[TABLE]

Now set $t:=z^{\frac{1}{a}}$ to get

[TABLE]

where $c^{\prime}:=\lvert c/b\rvert^{\frac{1}{a}}$ . Therefore, for any $c\in\mathbb{R}$ , we have

[TABLE]
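The final expression of this subsection is not reproduced in this copy. The sketch below assumes the closed form $\operatorname{E}[L(Z+c)]=\frac{(k_{1}-k_{2})c}{2}+\frac{(k_{1}+k_{2})\lvert c\rvert}{2\Gamma(a)}\gamma(a,c^{\prime})+\frac{(k_{1}+k_{2})b}{2\Gamma(a)}\Gamma(2a,c^{\prime})$ with $c^{\prime}=\lvert c/b\rvert^{1/a}$, reconstructed from the substitutions described above, and compares it with a direct Laplace ($a=1$) evaluation obtained independently from the memoryless property of the exponential tail. The series for $\gamma$ and all function names are illustrative choices, not the paper's.

```python
import math

def lower_gamma(a, x):
    """Lower incomplete gamma via x^a e^{-x} * sum_n x^n / (a (a+1) ... (a+n))."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total) and n < 10_000:
        term *= x / (a + n)
        total += term
        n += 1
    return x ** a * math.exp(-x) * total

def expected_loss(c, a, b, k1, k2):
    """Assumed closed form of E[L(Z + c)] for the generalized Gaussian error."""
    x = abs(c / b) ** (1.0 / a)
    g = math.gamma(a)
    upper_2a = math.gamma(2.0 * a) - lower_gamma(2.0 * a, x)  # Gamma(2a, x)
    return ((k1 - k2) * c / 2.0
            + (k1 + k2) * abs(c) * lower_gamma(a, x) / (2.0 * g)
            + (k1 + k2) * b * upper_2a / (2.0 * g))

def expected_loss_laplace(c, b, k1, k2):
    """Direct a = 1 evaluation: L(c) + (k1 + k2) * b * exp(-|c| / b) / 2."""
    loss_at_c = k1 * c if c >= 0.0 else -k2 * c
    return loss_at_c + (k1 + k2) * b * math.exp(-abs(c) / b) / 2.0

for c in (-1.3, 0.0, 0.5, 2.0):
    print(c, expected_loss(c, 1.0, 1.5, 50.0, 150.0), expected_loss_laplace(c, 1.5, 50.0, 150.0))
```

The two columns should agree to floating-point accuracy, which supports the reconstructed general formula in the Laplace case.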

6.2. Variance of the loss

Now let us calculate the variance of the loss $L(Z+c)$ for $c\in\mathbb{R}$ . Put $\beta:=(2ab\Gamma(a))^{-1}$ ; then, we have

[TABLE]

Replace $z$ with $bz$ to get

[TABLE]

When $c\geq 0$ , we have

[TABLE]

When $c<0$ , we have

[TABLE]

From the above, for any $c\in\mathbb{R}$ , we have

[TABLE]

Now set $t:=z^{\frac{1}{a}}$ to get

[TABLE]

where $c^{\prime}:=\lvert c/b\rvert^{\frac{1}{a}}$ . Therefore, for any $c\in\mathbb{R}$ , we have

[TABLE]

Also, from the expression for $\operatorname{E}[L(Z+c)]$ obtained in Section $6.1$, we have

[TABLE]

Therefore, for any $c\in\mathbb{R}$ , we have

[TABLE]
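The resulting variance formula can be coded term by term. In the sketch below, the last five terms follow the expression listed for $\operatorname{V}[L(Z+c)]$, while the first two terms, $\frac{(k_{1}+k_{2})^{2}c^{2}}{4}+\frac{(k_{1}^{2}-k_{2}^{2})bc}{2\Gamma(a)}\Gamma(2a,c^{\prime})$, are a reconstruction made here from $\operatorname{V}=\operatorname{E}[L^{2}]-\operatorname{E}[L]^{2}$, because that part of the display is missing from this copy. The series for $\gamma$ and the function names are illustrative.

```python
import math

def lower_gamma(a, x):
    """Lower incomplete gamma via x^a e^{-x} * sum_n x^n / (a (a+1) ... (a+n))."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total) and n < 10_000:
        term *= x / (a + n)
        total += term
        n += 1
    return x ** a * math.exp(-x) * total

def variance_loss(c, a, b, k1, k2):
    """Assumed closed form of V[L(Z + c)]; the first two terms are reconstructed."""
    x = abs(c / b) ** (1.0 / a)
    g = math.gamma(a)
    low_a = lower_gamma(a, x)                               # gamma(a, c')
    up_2a = math.gamma(2.0 * a) - lower_gamma(2.0 * a, x)   # Gamma(2a, c')
    low_3a = lower_gamma(3.0 * a, x)                        # gamma(3a, c')
    s = k1 + k2
    d2 = k1 * k1 - k2 * k2
    sign = 1.0 if c >= 0.0 else -1.0
    return (s * s * c * c / 4.0
            + d2 * b * c * up_2a / (2.0 * g)
            - s * s * b * abs(c) * low_a * up_2a / (2.0 * g * g)
            - s * s * c * c * low_a ** 2 / (4.0 * g * g)
            - s * s * b * b * up_2a ** 2 / (4.0 * g * g)
            + (k1 * k1 + k2 * k2) * b * b * math.gamma(3.0 * a) / (2.0 * g)
            + sign * d2 * b * b * low_3a / (2.0 * g))

print(variance_loss(0.0, 1.0, 1.0, 50.0, 150.0))
print(variance_loss(math.log(2.0), 1.0, 1.0, 50.0, 150.0))
```

The two printed values are the Laplace variances before and after shifting by $C=\log 2$ for $k_{1}=50$, $k_{2}=150$, $b=1$. The second should be smaller, in line with Theorem 12.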

Bibliography

[1] John Aldrich. Doing least squares: Perspectives from Gauss and Yule. International Statistical Review, 66(1):61–81, 1998.
[2] George E. Andrews, Richard Askey, and Ranjan Roy. Special Functions. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 1999.
[3] Jens Breckling and Ray Chambers. M-quantiles. Biometrika, 75(4):761–771, 1988.
[4] Alex Dytso, Ronit Bustin, H. Vincent Poor, and Shlomo Shamai. Analytical properties of generalized Gaussian distributions. Journal of Statistical Distributions and Applications, 5(1):6, Dec 2018.
[5] B. Efron. Regression percentiles using asymmetric squared error loss. Statistica Sinica, 1(1):93–125, 1991.
[6] Riccardo Guidotti, Anna Monreale, Franco Turini, Dino Pedreschi, and Fosca Giannotti. A survey of methods for explaining black box models. ACM Computing Surveys, 51, 02 2018.
[7] Roger Koenker and Gilbert Bassett. Regression quantiles. Econometrica, 46(1):33–50, 1978.
[8] A.M. Legendre. Nouvelles méthodes pour la détermination des orbites des comètes. Nineteenth Century Collections Online (NCCO): Science, Technology, and Medicine: 1780-1925. F. Didot, 1805.
