Strong Asymptotic Properties of Kernel Smoothing Estimation for NA Random Variables with Right Censoring

Jianhua Shi; Jiansen Xu; Jinfeng Xu

arXiv:1901.05764·math.ST·February 2, 2023

TL;DR

This paper extends kernel smoothing estimation methods to negatively associated random variables with right censoring, establishing their strong asymptotic properties to support practical applications in incomplete-data scenarios.

Contribution

It introduces the first strong asymptotic analysis of kernel estimators for NA variables under right censoring, relaxing previous ideal assumptions.

Findings

01. Established strong asymptotic properties of kernel density estimators

02. Validated the use of Kaplan-Meier based estimators in NA contexts

03. Provided theoretical justification for practical kernel smoothing in censored data

Abstract

Most studies for negatively associated (NA) random variables consider the complete-data situation, which is actually a relatively ideal condition in practice. The paper relaxes this condition to the incomplete-data setting and considers kernel smoothing density and hazard function estimation in the presence of right censoring based on the Kaplan-Meier estimator. We establish the strong asymptotic properties for these two estimators to assess their asymptotic behavior and justify their practical use.

Equations

$$\operatorname{cov}\left(f_{1}(T_{i};\, i\in B_{1}),\ f_{2}(T_{j};\, j\in B_{2})\right)\leq 0,$$

$$L_{n}(t)\triangleq\frac{1}{n}\sum_{k=1}^{n}I(X_{k}\leq t)\triangleq 1-\frac{Y_{n}(t)}{n}\triangleq\frac{\bar{Y}_{n}(t)}{n},$$

$$N_{n}(t)\triangleq\sum_{k=1}^{n}I(T_{k}\leq t\wedge Y_{k}),$$

$$1-\widehat{F}_{n}(x)\triangleq\prod_{t\leq x}\left(1-\frac{dN_{n}(t)}{Y_{n}(t)}\right),$$

$$F_{*}(t)=\int_{0}^{\infty}\left[\int_{s\leq t\wedge z}dF(s)\right]dG(z)=\int_{0}^{t}F(z)\,dG(z)+\int_{t}^{\infty}F(t)\,dG(z)$$

$$=-\int_{0}^{t}F(z)\,dS_{Y}(z)+F(t)S_{Y}(t)=\int_{0}^{t}S_{Y}(z)\,dF(z),$$

$$dF_{*}(t)=S_{Y}(t)\,dF(t).$$

$$h(t)\triangleq\frac{d}{dt}\left(-\log S_{T}(t)\right)=\frac{f(t)}{S_{T}(t)}\quad\text{for } S_{T}(t)>0.$$

$$H(t)\triangleq\int_{0}^{t}h(s)\,ds=\int_{0}^{t}\frac{dF_{*}(s)}{\bar{L}(s)}.$$

$$\widehat{H}_{n}(t)\triangleq\int_{0}^{t}\frac{dN_{n}(s)}{Y_{n}(s)}=\int_{0}^{t}\frac{dF_{*n}(s)}{\bar{L}_{n}(s)},$$

$$F_{*n}(t)\triangleq\frac{1}{n}\sum_{k=1}^{n}I(X_{k}\leq t,\ \delta_{k}=1)=\frac{N_{n}(t)}{n}$$

$$1-\widehat{F}_{n}(t)=\prod_{X_{(k)}\leq t}\left(1-\frac{dN_{n}(X_{(k)})}{n-k+1}\right)=\prod_{X_{(k)}\leq t}\left(1-\frac{\delta_{(k)}}{n-k+1}\right),$$

$$\widehat{H}_{n}(t)=\sum_{X_{(k)}\leq t}\frac{\delta_{(k)}}{n-k+1}.$$

$$\tilde{f}_{n}(t)\triangleq b_{n}^{-1}\int_{a_{F}}^{+\infty}k\left(\frac{t-x}{b_{n}}\right)d\widehat{F}_{n}(x),$$

$$\tilde{h}_{n}(t)\triangleq b_{n}^{-1}\int_{a_{F}}^{+\infty}k\left(\frac{t-x}{b_{n}}\right)d\widehat{H}_{n}(x).$$

$$f_{n}(t)$$

$$f_{n}^{*}(t)$$

$$\sup_{0\leq t\leq\tau}\left|\widehat{F}_{n}(t)-F(t)\right|=O\left((n^{-1}\ln n)^{1/2}\right)\ a.s.$$

$$\sup_{0\leq t\leq\tau}\left|\widehat{H}_{n}(t)-H(t)\right|=O\left((n^{-1}\ln n)^{1/2}\right)\ a.s.$$

$$\eta(x,t,\delta)\triangleq\int_{0}^{t\wedge x}\frac{dF_{*}(s)}{\bar{L}^{2}(s)}-\frac{I(x\leq t,\ \delta=1)}{\bar{L}(x)}.$$

$$\widehat{F}_{n}(t)-F(t)=-\frac{S_{T}(t)}{n}\sum_{i=1}^{n}\eta(X_{i},t,\delta_{i})+r_{1n}(t),$$

$$\widehat{H}_{n}(t)-H(t)=-\frac{1}{n}\sum_{i=1}^{n}\eta(X_{i},t,\delta_{i})+r_{2n}(t),$$

$$-\frac{1}{n}\sum_{i=1}^{n}\eta(X_{i},t,\delta_{i})$$

$$\widehat{F}_{n}(t)-F(t)$$

$$\sup_{0\leq t\leq\tau}\left|F_{*n}(t)-F_{*}(t)\right|=O\left((n^{-1}\ln n)^{1/2}\right)\ a.s.$$

$$\sup_{0\leq t\leq\tau}\left|L_{n}(t)-L(t)\right|=O\left((n^{-1}\ln n)^{1/2}\right)\ a.s.$$

$$\sup_{0<t\leq\tau}\left|\tilde{f}_{n}(t)-f_{n}(t)-\frac{f_{n}^{*}(t)-Ef_{n}^{*}(t)}{1-G(t)}\right|=O\left(b_{n}^{-1}(n^{-1}\ln n)^{1/2}\right)\ a.s.,$$

$$\sup_{0<t\leq\tau}\left|\tilde{h}_{n}(t)-h_{n}(t)-\frac{f_{n}^{*}(t)-Ef_{n}^{*}(t)}{1-L(t)}\right|=O\left(b_{n}^{-1}(n^{-1}\ln n)^{1/2}\right)\ a.s.,$$

$$\lim_{n\to\infty}\ \sup_{m:\,|m-n|\leq n\varepsilon}\left|\frac{b_{m}}{b_{n}}-1\right|\to 0\quad\text{for }\varepsilon\to 0,$$

$$\limsup_{n\to\infty}\ \pm\left(\frac{nb_{n}}{2\ln\ln n}\right)^{1/2}\left(\tilde{f}_{n}(t)-f_{n}(t)\right)=\left[\varphi(f,G)\int k^{2}(s)\,ds\right]^{1/2}\ a.s.$$

$$\tilde{f}_{n}(x)-f_{n}(x)$$

Taxonomy

Topics: Statistical Methods and Inference · Statistical Distribution Estimation and Applications

Full text

Strong Asymptotic Properties of Kernel Smoothing Estimation for NA Random Variables with Right Censoring

Jian-hua Shi $^{a,b,c,d}$, Jian-sen Xu $^{a}$, Jin-feng Xu $^{a,*}$

$^{a}$ School of Mathematics and Statistics, Minnan Normal University, Zhangzhou, 363000, China

$^{b}$ Fujian Key Laboratory of Granular Computing and Applications, Zhangzhou, 363000, China

$^{c}$ The Institute of Meteorological Big Data-Digital Fujian, Zhangzhou, 363000, China

$^{d}$ Fujian Key Laboratory of Data Science and Statistics, Zhangzhou, 363000, China

Abstract

Most studies for negatively associated (NA) random variables consider the complete-data situation, which is actually a relatively ideal condition in practice. The paper relaxes this condition to the incomplete-data setting and considers kernel smoothing density and hazard function estimation in the presence of right censoring based on the Kaplan-Meier estimator. We establish the strong asymptotic properties for these two estimators to assess their asymptotic behavior and justify their practical use.

Keywords: Kaplan-Meier estimator; kernel smoothing estimator; NA random variable; right-censoring.

2000 Mathematics Subject Classification. 62G05

$^{*}$ Corresponding author: Jin-feng Xu, Minnan Normal University, Zhangzhou, Fujian, China 363000. E-mail: [email protected].

1 Introduction

A negatively associated (NA) random sequence is a sequence of dependent random variables. The notion was first introduced by Alam and Saxena (1981) and then studied in detail by Joag-Dev and Proschan (1983). The definition is given as follows.

Definition (Joag-Dev and Proschan, 1983) Random variables $\{T_{i},\ 1\leq i\leq n\}$ are said to be negatively associated (NA) if for every pair of disjoint subsets $B_{1}$ and $B_{2}$ of $\{1,2,\ldots,n\}$,

$$\operatorname{cov}\left(f_{1}(T_{i};\, i\in B_{1}),\ f_{2}(T_{j};\, j\in B_{2})\right)\leq 0,$$

where $f_{1}(\cdot)$ and $f_{2}(\cdot)$ are both increasing in every variable (or both decreasing in every variable) and are such that the covariance exists. A sequence of random variables $\{T_{i};\ i\geq 1\}$ is said to be NA if every finite subfamily is NA.

Clearly, sequences of independent random variables are NA, and many dependent sequences, for example random sampling without replacement from a finite population, also fall into the NA category. Many researchers have studied the properties of NA random variables and obtained important results. For example, Su et al. (1997) established a probability inequality and some moment inequalities for the partial sums of an NA sequence, which can be used to prove properties of strictly stationary NA sequences such as the weak invariance principle. Shao (2000) proved that most of the well-known inequalities, such as the Kolmogorov exponential inequality and the Rosenthal maximal inequality, still hold for NA random variables. Wu and Chen (2013) presented two strong representation results for the Kaplan-Meier estimator for NA data with censoring, which are the results most relevant to the main results of our paper. Zhou and Lin (2015) considered a nonparametric regression model with repeated NA error structures, where the wavelet method is used to estimate the regression function. Thuan and Quang (2016) studied some properties of NA random variables and obtained inequalities including a maximal inequality and a Hájek-Rényi type inequality. Tang et al. (2018) studied the asymptotic normality of the wavelet estimator in nonparametric regression, where the random errors are asymptotically NA random variables. Meng (2018) established two general strong laws of large numbers which also cover NA random variables.
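As a small illustration of the sampling-without-replacement example, the following sketch computes, by exact enumeration, the covariance between increasing functions of two draws taken without replacement. The population `[1, 2, 3, 4, 5]`, the helper name `exact_cov`, and the two test functions are assumptions chosen for illustration; checking two pairs of functions does not prove negative association, it only shows the sign predicted by the definition.

```python
from itertools import permutations


def exact_cov(population, f1, f2):
    """Exact covariance of f1(T_1) and f2(T_2), where (T_1, T_2) is an ordered
    pair drawn without replacement from `population` (all pairs equally likely)."""
    pairs = list(permutations(population, 2))
    m = len(pairs)
    e1 = sum(f1(a) for a, _ in pairs) / m
    e2 = sum(f2(b) for _, b in pairs) / m
    e12 = sum(f1(a) * f2(b) for a, b in pairs) / m
    return e12 - e1 * e2


# Hypothetical finite population used only for this illustration.
population = [1, 2, 3, 4, 5]

# Two pairs of increasing functions acting on the disjoint index sets {1} and {2}.
cov_identity = exact_cov(population, lambda x: x, lambda x: x)
cov_mixed = exact_cov(population, lambda x: x ** 3, lambda x: float(x >= 3))

print(cov_identity, cov_mixed)  # both values are non-positive
```

For the identity functions the result agrees with the classical formula $-\sigma^{2}/(N-1)$ for the covariance of two draws without replacement from a population of size $N$ with variance $\sigma^{2}$.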

Most studies of NA random variables, however, assume a complete-data setting, which is a relatively ideal condition in practice. In survival analysis, right censoring is often encountered. For detailed discussions of censoring and its practical relevance, see Gijbels and Wang (1993), Zhou and Yip (1999), Chen et al. (2015), Qiu et al. (2015), Shi et al. (2018), Ma et al. (2019), and Zhang and Zhou (2018), among many others.

Let $(T_{i},Y_{i}),i=1,\ldots,n,$ denote a sequence of random vectors where $T_{i}\geq 0$ is the true survival time of interest, which is right censored by the random variable $Y_{i}\geq 0$ . It is assumed that $T_{i}$ is independent of $Y_{i}$ , but the i.i.d. assumption is not made for $T_{i}$ ’s and $Y_{i}$ ’s, which are both NA in our paper. The observations consist of $({{X}_{i}},{{\delta}_{i}})$ , where

${{X}_{i}}\triangleq\min({{T}_{i}},{{Y}_{i}})\triangleq{{T}_{i}}\wedge{{Y}_{i}}$ and ${{\delta}_{i}}\triangleq I({{T}_{i}}\leq{{Y}_{i}}),\ i=1,\ldots,n,$

and $I(A)$ is the indicator function of the random event $A.$ For simplicity, assume that $T_{i}$ have a common continuous marginal distribution function $F(t)\triangleq P({{T}_{i}}\leq t)$ and let its survival distribution $S_{T}(t)\triangleq 1-F(t)$ . The random censoring times ${{Y}_{i}},i=1,\ldots,n$ , being independent of the random variables ${{T}_{i}}$ ’s , are assumed to have a common distribution function $G(t)\triangleq P({{Y}_{i}}\leq t)$ with its survival distribution $S_{Y}(t)\triangleq 1-G(t)$ . Meanwhile, let $L(\cdot)$ be the distribution of the observed variable ${{X}_{i}}$ ’s, and we write its survival distribution as $\bar{L}(t)\triangleq 1-L(t).$ For any distribution function $H(\cdot)$ , we define the left and right endpoints of its support as ${{a}_{H}}$ and ${{\tau}_{H}}$ by ${{a}_{H}}\triangleq\inf\{t:H(t)>0\},{{\tau}_{H}}\triangleq\sup\{t:H(t)<1\}$ throughout the paper.

The distribution function $L(\cdot)$ can be consistently estimated by the empirical distribution function ${{L}_{n}}(t)$ , which is defined as follows.

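$$L_{n}(t)\triangleq\frac{1}{n}\sum_{k=1}^{n}I(X_{k}\leq t)\triangleq 1-\frac{Y_{n}(t)}{n}\triangleq\frac{\bar{Y}_{n}(t)}{n},$$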

where ${{Y}_{n}}(t)\triangleq\sum\limits_{k=1}^{n}{I({{X}_{k}}>t)}$ denotes the number of uncensored and censored observations larger than time $t$ , and ${{\bar{Y}}_{n}}(t)\triangleq\sum\limits_{k=1}^{n}{I({{X}_{k}}\leq t)}$ .

For drawing nonparametric inference about unknown $F(\cdot)$ based on the censored observations $(X_{i},\delta_{i}),i=1,\ldots,n,$ we introduce a stochastic process on $[0,\infty)$ as follows

$$N_{n}(t)\triangleq\sum_{k=1}^{n}I(T_{k}\leq t\wedge Y_{k}),$$

which counts the number of uncensored observations no larger than time $t$ . One nonparametric maximum likelihood estimator ${{\widehat{F}}_{n}}(\cdot)$ of $F(\cdot)$ is the well-known Kaplan-Meier (K-M) estimator (Kaplan and Meier, 1958), which is commonly used to estimate $F(\cdot)$ from the incomplete data $({{X}_{i}},{{\delta}_{i}})$ , i.e.

$$1-\widehat{F}_{n}(x)\triangleq\prod_{t\leq x}\left(1-\frac{dN_{n}(t)}{Y_{n}(t)}\right),$$

where the jump $d{{N}_{n}}(t)\triangleq{{N}_{n}}(t)-{{N}_{n}}(t-).$

Define the sub-distribution function ${{F}_{*}}(t)\triangleq P({{X}_{1}}\leq t,{{\delta}_{1}}=1).$ Since $F(0)=0$ , using integration by parts, we have

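$$F_{*}(t)=\int_{0}^{\infty}\left[\int_{s\leq t\wedge z}dF(s)\right]dG(z)=\int_{0}^{t}F(z)\,dG(z)+\int_{t}^{\infty}F(t)\,dG(z)$$

$$=-\int_{0}^{t}F(z)\,dS_{Y}(z)+F(t)S_{Y}(t)=\int_{0}^{t}S_{Y}(z)\,dF(z),$$

and then

$$dF_{*}(t)=S_{Y}(t)\,dF(t).$$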

It is further assumed that $F(\cdot)$ has a density function $f(\cdot)$ . The estimation for the hazard function $h(\cdot)$ is also of substantial interest in survival analysis, which is defined as

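$$h(t)\triangleq\frac{d}{dt}\left(-\log S_{T}(t)\right)=\frac{f(t)}{S_{T}(t)}\quad\text{for } S_{T}(t)>0.$$

The corresponding cumulative hazard function is defined as

$$H(t)\triangleq\int_{0}^{t}h(s)\,ds=\int_{0}^{t}\frac{dF_{*}(s)}{\bar{L}(s)}.\qquad(1.1)$$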
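The identity $H(t)=\int_{0}^{t}dF_{*}(s)/\bar{L}(s)$ can be checked in one line. The sketch below assumes only what is stated above: $T_{1}$ and $Y_{1}$ are independent, $F$ has density $f$, and $dF_{*}(s)=S_{Y}(s)\,dF(s)$. For $s$ with $\bar{L}(s)>0$,

$$\bar{L}(s)=P(T_{1}>s,\ Y_{1}>s)=S_{T}(s)\,S_{Y}(s),\qquad\frac{dF_{*}(s)}{\bar{L}(s)}=\frac{S_{Y}(s)\,f(s)\,ds}{S_{T}(s)\,S_{Y}(s)}=\frac{f(s)}{S_{T}(s)}\,ds=h(s)\,ds,$$

and integrating over $[0,t]$ gives the stated representation. The censoring distribution cancels, which is why $H(\cdot)$ can be estimated from the observable quantities $F_{*}(\cdot)$ and $\bar{L}(\cdot)$ alone.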

The representation (1.1) of $H(\cdot)$ in terms of ${F}_{*}(\cdot)$ and $\bar{L}(\cdot)$ suggests the empirical estimator for $H(\cdot)$ by

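$$\widehat{H}_{n}(t)\triangleq\int_{0}^{t}\frac{dN_{n}(s)}{Y_{n}(s)}=\int_{0}^{t}\frac{dF_{*n}(s)}{\bar{L}_{n}(s)},$$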

where ${{\bar{L}}_{n}}(t)\triangleq 1-{{L}_{n}}(t),$ and

$$F_{*n}(t)\triangleq\frac{1}{n}\sum_{k=1}^{n}I(X_{k}\leq t,\ \delta_{k}=1)=\frac{N_{n}(t)}{n}$$

denotes the empirical estimator of ${{F}_{*}(\cdot)}$ .

Note that $d{N_{n}}({X_{(k)}})=\sum\limits_{j=1}^{n}{[{\delta_{j}}I({X_{j}}={X_{(k)}})]}\triangleq{\delta_{(k)}},k=1,2,\ldots,n,$ where ${{X}_{(1)}}\leq{{X}_{(2)}}\leq\ldots\leq{{X}_{(n)}}$ are the order statistics of ${{X}_{1}},{{X}_{2}},\ldots,{{X}_{n}}$ , and ${{\delta}_{(k)}}$ is the concomitant of ${{X}_{(k)}}$ . It can be verified that the estimators ${{\widehat{F}}_{n}}(\cdot)$ and ${\widehat{H}_{n}}(\cdot)$ can be respectively represented as

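$$1-\widehat{F}_{n}(t)=\prod_{X_{(k)}\leq t}\left(1-\frac{dN_{n}(X_{(k)})}{n-k+1}\right)=\prod_{X_{(k)}\leq t}\left(1-\frac{\delta_{(k)}}{n-k+1}\right),$$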

and

$$\widehat{H}_{n}(t)=\sum_{X_{(k)}\leq t}\frac{\delta_{(k)}}{n-k+1}.$$

In the case of right censoring, the K-M estimator ${{\widehat{F}}_{n}}(x)$ and the estimator ${{\widehat{H}}_{n}}(x)$ have been generally accepted as a substitute for the usual empirical estimators of distribution function $F(\cdot)$ and the cumulative hazard function $H(\cdot)$ , respectively, which help to study other estimators such as the kernel density estimator and the kernel hazard estimator in the following.
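The order-statistic representations of ${{\widehat{F}}_{n}}(\cdot)$ and ${{\widehat{H}}_{n}}(\cdot)$ translate directly into a few lines of code. The following is a minimal sketch, assuming no ties among the observed times; the function name `km_and_cumhaz` and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np


def km_and_cumhaz(x, delta):
    """Evaluate the K-M estimator F_n and the cumulative hazard estimator H_n
    at the ordered observations X_(1) <= ... <= X_(n).

    x     : observed times X_i = min(T_i, Y_i)
    delta : indicators delta_i = I(T_i <= Y_i)
    Assumes there are no ties among the observed times.
    """
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=float)
    order = np.argsort(x, kind="stable")
    xs, ds = x[order], delta[order]       # order statistics and concomitants
    n = len(xs)
    at_risk = n - np.arange(n)            # n - k + 1 for k = 1, ..., n
    jumps = ds / at_risk                  # delta_(k) / (n - k + 1)
    F_hat = 1.0 - np.cumprod(1.0 - jumps) # product-limit form
    H_hat = np.cumsum(jumps)              # sum form
    return xs, F_hat, H_hat


# Toy data: the observation at time 4 is censored.
xs, F_hat, H_hat = km_and_cumhaz([2.0, 1.0, 4.0, 3.0], [1, 1, 0, 1])
print(xs, F_hat, H_hat)
```

Without censoring (all ${{\delta}_{i}}=1$) the product collapses to the ordinary empirical distribution function, which is a convenient sanity check.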

A kernel smoothed estimator for $f(\cdot)$ based on ${\widehat{F}}_{n}(\cdot)$ can be constructed as

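$$\tilde{f}_{n}(t)\triangleq b_{n}^{-1}\int_{a_{F}}^{+\infty}k\left(\frac{t-x}{b_{n}}\right)d\widehat{F}_{n}(x),$$

where $k(\cdot)$ is a smooth probability kernel function and $\{{b_{n}},n\geq 1\}$ is a sequence of bandwidths tending to zero at an appropriate rate.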

Similarly, we can construct a kernel smoothed estimator for the hazard function $h(\cdot)$ under NA sampling, which is defined by

$$\tilde{h}_{n}(t)\triangleq b_{n}^{-1}\int_{a_{F}}^{+\infty}k\left(\frac{t-x}{b_{n}}\right)d\widehat{H}_{n}(x).$$
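Because ${{\widehat{F}}_{n}}(\cdot)$ and ${{\widehat{H}}_{n}}(\cdot)$ are step functions, both integrals reduce to weighted sums over the ordered observations, with weights given by the jumps of the respective estimator. The sketch below illustrates this under stated assumptions: no ties among the observed times, and an Epanechnikov kernel, which is a choice made here for illustration (it vanishes outside $(-1,1)$ and has bounded variation) and is not prescribed by the paper. The function names are hypothetical.

```python
import numpy as np


def epanechnikov(u):
    """Compactly supported kernel k(u) = 0.75 (1 - u^2) on (-1, 1), zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u ** 2), 0.0)


def kernel_density_hazard(t, x, delta, b):
    """Kernel-smoothed density and hazard estimates at the points t.

    Uses the jumps of the K-M estimator (for the density) and of the
    cumulative hazard estimator (for the hazard). Assumes no ties in x.
    """
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=float)
    order = np.argsort(x, kind="stable")
    xs, ds = x[order], delta[order]
    n = len(xs)
    dH = ds / (n - np.arange(n))                    # jumps of H_n: delta_(k)/(n-k+1)
    surv = np.cumprod(1.0 - dH)                     # 1 - F_n at X_(k)
    surv_prev = np.concatenate(([1.0], surv[:-1]))  # 1 - F_n just before X_(k)
    dF = surv_prev - surv                           # jumps of F_n
    t = np.atleast_1d(np.asarray(t, dtype=float))
    weights = epanechnikov((t[:, None] - xs[None, :]) / b) / b
    return weights @ dF, weights @ dH


# Toy uncensored sample, evaluated at t = 2 with bandwidth b = 1.
f_tilde, h_tilde = kernel_density_hazard(2.0, [1.0, 2.0, 3.0], [1, 1, 1], 1.0)
print(f_tilde, h_tilde)
```

Censored observations contribute no jump to either estimator, so they enter only through the risk-set sizes $n-k+1$.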

The estimators ${\tilde{f}_{n}}(\cdot)$ and ${\tilde{h}_{n}}(\cdot)$ have attracted the attention of many investigators. For example, Mielniczuk (1986) investigated the kernel estimator of a density function based on the K-M estimator for censored data. For censored $\alpha$ -mixing data, Cai (1998) explored the uniform consistency (with rates) and the asymptotic normality of the kernel estimators for the density and hazard functions. Zhou (1999) established several asymptotic uniformly strong and weak representations for kernel estimators of the density function and the hazard function under left truncation. Antoniadis et al. (1999) proposed a wavelet method for estimating density and hazard rate functions from randomly right-censored data. For other results, see Diehl and Stute (1988), Gijbels and Wang (1993), Arcones and Giné (1995), Zhou and Yip (1999), Lemdani and Ould-Saïd (2007), and Shen and He (2008), among others.

To present our main results, define

[TABLE]

The main purpose of this paper is to study the asymptotic properties of the kernel smoothing density estimator $\tilde{f}_{n}(\cdot)$ and hazard estimator ${\tilde{h}_{n}}(\cdot)$ based on censored NA random variables. Under certain regularity conditions, we establish strong asymptotic properties for the two estimators with convergence rate $O(b_{n}^{-1}(n^{-1}\ln n)^{1/2})\ a.s.$ , where the conditions on $\{{b_{n}},n\geq 1\}$ are given in Section 2. Throughout the paper, the sequences of variables $\{{{T}_{n}};n\geq 1\}$ and $\{{{Y}_{n}};n\geq 1\}$ are all non-negative unless otherwise specified.

2 Main results and their proofs

We first present two lemmas (Wu and Chen, 2013) that will help to prove our theorems.

Lemma 1

Let $\{{{T}_{n}};n\geq 1\}$ and $\{{{Y}_{n}};n\geq 1\}$ be two sequences of NA random variables. Suppose that the sequences $\{{{T}_{n}};n\geq 1\}$ and $\{{{Y}_{n}};n\geq 1\}$ are independent. Then, for any $0<\tau<{{\tau}_{L}}={{\tau}_{F}}\wedge{{\tau}_{G}},$

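$$\sup_{0\leq t\leq\tau}\left|\widehat{F}_{n}(t)-F(t)\right|=O\left((n^{-1}\ln n)^{1/2}\right)\ a.s.$$

and

$$\sup_{0\leq t\leq\tau}\left|\widehat{H}_{n}(t)-H(t)\right|=O\left((n^{-1}\ln n)^{1/2}\right)\ a.s.$$

For positive real numbers $x$ and $t$ , write

$$\eta(x,t,\delta)\triangleq\int_{0}^{t\wedge x}\frac{dF_{*}(s)}{\bar{L}^{2}(s)}-\frac{I(x\leq t,\ \delta=1)}{\bar{L}(x)}.$$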

Lemma 2

Let $\{{{T}_{n}};n\geq 1\}$ and $\{{{Y}_{n}};n\geq 1\}$ be two sequences of NA random variables. Suppose that the sequences $\{{{T}_{n}};n\geq 1\}$ and $\{{{Y}_{n}};n\geq 1\}$ are independent. Then, for any $0<\tau<{{\tau}_{L}},$

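$$\widehat{F}_{n}(t)-F(t)=-\frac{S_{T}(t)}{n}\sum_{i=1}^{n}\eta(X_{i},t,\delta_{i})+r_{1n}(t),$$

and

$$\widehat{H}_{n}(t)-H(t)=-\frac{1}{n}\sum_{i=1}^{n}\eta(X_{i},t,\delta_{i})+r_{2n}(t),$$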

where $\sup_{0\leq t\leq\tau}\left|r_{in}(t)\right|=O((n^{-1}\ln n)^{1/2})\ a.s.,\ i=1,2$ .

Remark 1

Note that by the definition of $\eta(x,t,\delta)$ ,

[TABLE]

Therefore, we can obtain by Lemma 2 that

[TABLE]

One can establish the following lemma by noting that $\{{X_{n}};n\geq 1\}$ and $\{({X_{n}},{\delta_{n}});n\geq 1\}$ are both sequences of NA random variables according to Joag-Dev and Proschan (1983).

Lemma 3

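Under the conditions of Lemma 1, for any $0<\tau<{{\tau}_{L}},$ we have

$$\sup_{0\leq t\leq\tau}\left|F_{*n}(t)-F_{*}(t)\right|=O\left((n^{-1}\ln n)^{1/2}\right)\ a.s.$$

and

$$\sup_{0\leq t\leq\tau}\left|L_{n}(t)-L(t)\right|=O\left((n^{-1}\ln n)^{1/2}\right)\ a.s.$$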

Theorem 1

Under the conditions of Lemma 1, assume that the kernel density $k(t)$ has bounded variation on some finite interval $(r,s)$ with $k(t)=0$ for $t\notin(r,s)$ , where $-\infty<r<0<s<\infty$ . Suppose that the density functions $f(\cdot)$ and $g(\cdot)={G}^{\prime}(\cdot)$ are bounded on the closed interval $[0,\tau]$ for some $\tau\in(a_{F},{{\tau}_{L}}).$ Then

$$\sup_{0<t\leq\tau}\left|\tilde{f}_{n}(t)-f_{n}(t)-\frac{f_{n}^{*}(t)-Ef_{n}^{*}(t)}{1-G(t)}\right|=O\left(b_{n}^{-1}(n^{-1}\ln n)^{1/2}\right)\ a.s.,$$

where the sequence $\{{{b}_{n}};n\geq 1\}$ satisfies $b_{n}^{-1}=o((n\ln^{-1}n)^{1/2})$ .
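To get a feel for the bandwidth condition, consider the hypothetical family $b_{n}=n^{-\alpha}$, which is an assumption made here for illustration and not a choice made in the paper. The bound $b_{n}^{-1}(n^{-1}\ln n)^{1/2}$ then equals $n^{\alpha-1/2}(\ln n)^{1/2}$, which tends to zero exactly when $\alpha<1/2$. The sketch below evaluates the bound numerically; the function names are illustrative.

```python
import math


def power_bandwidth(n, alpha):
    """Hypothetical bandwidth b_n = n^(-alpha); not prescribed by the paper."""
    return n ** (-alpha)


def remainder_rate(n, b_n):
    """The order b_n^{-1} (ln n / n)^{1/2} appearing in the remainder bound."""
    return math.sqrt(math.log(n) / n) / b_n


sample_sizes = [10 ** 2, 10 ** 4, 10 ** 6]

# alpha = 0.2 satisfies b_n^{-1} = o((n / ln n)^{1/2}); the bound shrinks with n.
rates_ok = [remainder_rate(n, power_bandwidth(n, 0.2)) for n in sample_sizes]

# alpha = 0.5 violates the condition; the bound equals sqrt(ln n) and grows.
rates_bad = [remainder_rate(n, power_bandwidth(n, 0.5)) for n in sample_sizes]

print(rates_ok)
print(rates_bad)
```

Smaller bandwidths reduce smoothing bias but inflate the factor $b_{n}^{-1}$, so the condition on $b_{n}$ is what keeps the remainder term asymptotically negligible.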

Theorem 2

Under the conditions of Lemma 1, assume that the kernel density $k(t)$ has bounded variation on some finite interval $(r,s)$ with $k(t)=0$ for $t\notin(r,s)$ , where $-\infty<r<0<s<\infty.$ Suppose that the density functions $f(\cdot)$ and $g(\cdot)={G}^{\prime}(\cdot)$ are bounded on the closed interval $[0,\tau]$ for some $\tau\in(a_{F},{{\tau}_{L}}).$ Then

$$\sup_{0<t\leq\tau}\left|\tilde{h}_{n}(t)-h_{n}(t)-\frac{f_{n}^{*}(t)-Ef_{n}^{*}(t)}{1-L(t)}\right|=O\left(b_{n}^{-1}(n^{-1}\ln n)^{1/2}\right)\ a.s.,$$

where the sequence $\{{{b}_{n}};n\geq 1\}$ satisfies $b_{n}^{-1}=o((n\ln^{-1}n)^{1/2})$ .

Remark 2

Theorem 1 and Theorem 2 are key results in studying censored NA sequences, which can be useful in deriving some asymptotic properties for the kernel density estimator $\tilde{f}_{n}(\cdot)$ and the hazard function estimator $\tilde{h}_{n}(\cdot)$ , respectively. For example, if one can establish the results similar to those in Hall (1981) for NA sequences, then by using Theorem 1, the following proposition will hold.

Proposition 1 Suppose that the sequence $\{{{b}_{n}};n\geq 1\}$ satisfies ${b_{n}}\to 0$ as $n\to\infty$ , and

(a) $(\ln n)^{2}/(nb_{n}\ln\ln n)\to 0,$

(b) $n{b_{n}}\to\infty$ in such a way that

$$\lim_{n\to\infty}\ \sup_{m:\,|m-n|\leq n\varepsilon}\left|\frac{b_{m}}{b_{n}}-1\right|\to 0\quad\text{for }\varepsilon\to 0.$$

Then

$$\limsup_{n\to\infty}\ \pm\left(\frac{nb_{n}}{2\ln\ln n}\right)^{1/2}\left(\tilde{f}_{n}(t)-f_{n}(t)\right)=\left[\varphi(f,G)\int k^{2}(s)\,ds\right]^{1/2}\ a.s.,$$

where $\varphi(f,G)$ is some functional of $f(\cdot)$ and $G(\cdot)$ .

For simplicity and without loss of generality, it can be assumed that $a_{F}=0$ in the following proof.

Proof of Theorem 1 According to Remark 1, ${{\tilde{f}}_{n}}(x)-{{{f}}_{n}}(x)$ can be expressed as

[TABLE]

Considering ${{I}_{1}}$ , we have

[TABLE]

Thus, we have the following formula

[TABLE]

Integrating ${{I}_{11}}$ by parts, we get

[TABLE]

Note that the kernel function $k(\cdot)$ is zero outside the interval $(r,s)$ and that $G(\cdot)$ is monotone. Then, when $n$ is large enough, by (2.5) and the definition of $\tau$ , applying the change of variable formula, we have

[TABLE]

and

[TABLE]

Again, note that

[TABLE]

Integrating by parts for ${{I}_{12}}$ , we have for $0<\tau<{{\tau}_{L}}$ ,

[TABLE]

Since the density functions $f(\cdot)$ and $g(\cdot)$ are bounded on the closed interval $[0,\tau]$ , the derivative $-{\bar{L}}^{\prime}(s)=f(s)(1-G(s))+g(s)(1-F(s))$ is also bounded on $[0,\tau]$ , and hence

[TABLE]

where $M$ is some positive constant number.

Thus, combining equations (2.10) - (2.13), we have

[TABLE]

On the other hand, similar to the discussion of ${{I}_{12}}$ ,

[TABLE]

where

[TABLE]

and

[TABLE]

It can be obtained by (2.5) that

[TABLE]

As for the term ${{I}_{3}}$ , since $k(\cdot)$ is of bounded variation, it follows from Lemma 2 that

[TABLE]

This completes the proof by combining (2.9) and (2.14)- (2.16).

Proof of Theorem 2 Note by the strong asymptotic expression from (2.4),

[TABLE]

and similarly for the term $I_{11}$ , we have

[TABLE]

Then following the proofs of Theorem 1, we can also obtain Theorem 2. This completes our proof.

Bibliography

[1] Antoniadis, A., Grégoire, G., Nason, G. P. (1999) Density and hazard rate estimation for right-censored data by using wavelet methods. Journal of the Royal Statistical Society, Series B, 61(1), 63-84.
[2] Alam, K., Saxena, K. M. L. (1981) Positive dependence in multivariate distributions. Communications in Statistics - Theory and Methods, 10(12), 1183-1196.
[3] Arcones, M. A., Giné, E. (1995) On the law of the iterated logarithm for canonical U-statistics and processes. Stochastic Processes and Their Applications, 58(2), 217-245.
[4] Cai, Z. (1998) Asymptotic properties of Kaplan-Meier estimator for censored dependent data. Statistics and Probability Letters, 37(4), 381-389.
[5] Chen, X., Shi, J., Zhou, Y. (2015) Monotone rank estimation of transformation models with length-biased and right-censored data. Science China Mathematics (English series), 58(10), 2055-2068.
[6] Diehl, S., Stute, W. (1988) Kernel density and hazard function estimation in the presence of censoring. Journal of Multivariate Analysis, 25(2), 299-310.
[7] Gijbels, I., Wang, J. (1993) Strong representations of the survival function estimator for truncated and censored data with applications. Journal of Multivariate Analysis, 47(2), 210-229.
[8] Joag-Dev, K., Proschan, F. (1983) Negative association of random variables with applications. The Annals of Statistics, 11(1), 286-295.