Estimation in the convolution structure density model. Part II: adaptation over the scale of anisotropic classes

Oleg Lepski; Thomas Willer

arXiv:1704.04420·math.ST·April 17, 2017


TL;DR

This paper advances adaptive minimax estimation in the convolution structure density model over anisotropic Nikol'skii classes, highlighting the impact of boundedness on the risk and proposing a near-optimal adaptive estimator.

Contribution

It fully characterizes the minimax risk behavior across different parameters and introduces a selection rule for constructing nearly optimal adaptive estimators.

Findings

01

Boundedness of the function improves minimax risk asymptotics.

02

The proposed selection rule yields near-optimal adaptive estimators.

03

The behavior of minimax risk varies with regularity and norm parameters.

Abstract

This paper continues the research started in Lepski and Willer (2016). In the framework of the convolution structure density model on ${\mathbb{R}}^{d}$, we address the problem of adaptive minimax estimation with ${\mathbb{L}}_{p}$–loss over the scale of anisotropic Nikol'skii classes. We fully characterize the behavior of the minimax risk for different relationships between regularity parameters and norm indexes in the definitions of the functional class and of the risk. In particular, we show that the boundedness of the function to be estimated leads to an essential improvement of the asymptotic of the minimax risk. We prove that the selection rule proposed in Part I leads to the construction of an optimally or nearly optimally (up to logarithmic factor) adaptive estimator.

Equations

\mathfrak{p}=(1-\alpha)f+\alpha[f\star g],\quad f\in\mathbb{F}_{g}(R),\;\alpha\in[0,1],

\big{[}f\star g\big{]}(x)=\int_{{\mathbb{R}}^{d}}f(x-z)g(z)\nu_{d}({\rm d}z),\;\;x\in{\mathbb{R}}^{d},

\mathbb{F}_{g}(R)=\Big{\{}f\in\mathbb{B}_{1,d}(R):\;(1-\alpha)f+\alpha[f\star g]\in\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}\Big{\}}.

{\cal R}^{(p)}_{n}[\hat{f},f]:=\Big{(}\mathbb{E}_{f}\|\hat{f}-f\|_{p}^{p}\Big{)}^{1/p},\;p\in[1,\infty),

\phi_{n}(\mathbb{F}):=\inf_{\tilde{f}_{n}}{\cal R}^{(p)}_{n}\big{[}\tilde{f}_{n};\mathbb{F}\big{]}.

\limsup_{n\to\infty}\phi^{-1}_{n}(\mathbb{F}_{\vartheta}){\cal R}^{(p)}_{n}\big{[}\hat{f}_{n};\mathbb{F}_{\vartheta}\big{]}<\infty,\;\;\forall\vartheta\in\Theta?

\limsup_{n\to\infty}\phi^{-1}_{\frac{n}{\ln n}}(\mathbb{F}_{\vartheta}){\cal R}^{(p)}_{n}\big{[}\hat{f}_{n};\mathbb{F}_{\vartheta}\big{]}<\infty,\;\;\forall\vartheta\in\Theta.

\mathbb{F}_{\vartheta}={\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g,\mathbf{\infty}}(R,Q),\;\;\vartheta=\big{(}\vec{\beta},\vec{r},\vec{L},R,Q\big{)},

\mathbb{F}_{g,\mathbf{\infty}}(R,Q):=\big{\{}f\in\mathbb{F}_{g}(R):\;(1-\alpha)f+\alpha[f\star g]\in\mathbb{B}_{\mathbf{\infty},d}(Q)\big{\}},

\mathbb{F}_{\vartheta}={\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g}(R)\cap\mathbb{B}_{\infty,d}(Q),\;\vartheta=\big{(}\vec{\beta},\vec{r},\vec{L},R,Q\big{)}.

Z_{i}=X_{i}+\epsilon_{i}Y_{i},\quad i=1,\ldots,n,

\displaystyle\Upsilon_{1}\prod_{j=1}^{d}(1+t^{2}_{j})^{-\frac{\mu_{j}}{2}}\leq\big{|}\check{g}(t)\big{|}\leq\Upsilon_{2}\prod_{j=1}^{d}(1+t^{2}_{j})^{-\frac{\mu_{j}}{2}},\quad\forall t=(t_{1},\ldots,t_{d})\in{\mathbb{R}}^{d}.

\displaystyle\sup_{J\in\mathfrak{J}^{*}}\int_{{\mathbb{R}}^{d}}g(z)\Big{(}\prod_{j\in J}z^{2}_{j}\Big{)}{\rm d}z<\infty.

\frac{1}{\beta(\alpha)}=\sum_{j=1}^{d}\frac{2\boldsymbol{\mu}_{j}(\alpha)+1}{\beta_{j}},\quad\frac{1}{\omega(\alpha)}=\sum_{j=1}^{d}\frac{2\boldsymbol{\mu}_{j}(\alpha)+1}{\beta_{j}r_{j}},\quad L(\alpha)=\prod_{j=1}^{d}L_{j}^{\frac{2\boldsymbol{\mu}_{j}(\alpha)+1}{\beta_{j}}}.

\varkappa_{\alpha}(s)=\omega(\alpha)(2+1/\beta(\alpha))-s,\quad\tau(s)=1-1/\omega(0)+1/(s\beta(0)).

\varrho(\alpha)

\boldsymbol{\delta}_{n}

\liminf_{n\to\infty}\;\inf_{\tilde{f}_{n}}\sup_{f\in{\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g}(R)}\boldsymbol{\delta}_{n}^{-\varrho(\alpha)}{\cal R}^{(p)}_{n}\big{[}\tilde{f}_{n};f\big{]}\geq c,

\rho(\alpha)

\liminf_{n\to\infty}\;\inf_{\tilde{f}_{n}}\sup_{f\in{\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}\cap\mathbb{B}_{\infty,d}(Q)}\boldsymbol{\delta}_{n}^{-\rho(\alpha)}{\cal R}^{(p)}_{n}\big{[}\tilde{f}_{n};f\big{]}\geq c,

\big{|}1-\alpha+\alpha\check{g}(t)\big{|}\geq 1-\alpha,\quad\forall t\in{\mathbb{R}}^{d}.

K_{\vec{h}}(t)=V^{-1}_{\vec{h}}K\big{(}t_{1}/h_{1},\ldots,t_{d}/h_{d}\big{)},\;t\in{\mathbb{R}}^{d}.

\displaystyle K_{\vec{h}}(y)=(1-\alpha)M\big{(}y,\vec{h}\big{)}+\alpha\int_{{\mathbb{R}}^{d}}g(t-y)M\big{(}t,\vec{h}\big{)}{\rm d}t,\quad y\in{\mathbb{R}}^{d}.

\displaystyle\widehat{f}_{\vec{\mathrm{h}}}(x)=n^{-1}\sum_{i=1}^{n}M\big{(}Z_{i}-x,\vec{\mathrm{h}}\big{)},\qquad\widehat{\sigma}^{2}\big{(}x,\vec{\mathrm{h}}\big{)}=\frac{1}{n}\sum_{i=1}^{n}M^{2}\big{(}Z_{i}-x,\vec{\mathrm{h}}\big{)};

\displaystyle\widehat{U}_{n}\big{(}x,\vec{\mathrm{h}}\big{)}=\sqrt{\frac{2\lambda_{n}\big{(}\vec{\mathrm{h}}\big{)}\widehat{\sigma}^{2}\big{(}x,\vec{\mathrm{h}}\big{)}}{n}}+\frac{4M_{\infty}\lambda_{n}\big{(}\vec{\mathrm{h}}\big{)}}{3n\prod_{j=1}^{d}\mathrm{h}_{j}(\mathrm{h}_{j}\wedge 1)^{\boldsymbol{\mu}_{j}(\alpha)}},

\displaystyle\lambda_{n}\big{(}\vec{\mathrm{h}}\big{)}=4\ln(M_{\infty})+6\ln{(n)}+(8p+26)\sum_{j=1}^{d}\big{[}1+\boldsymbol{\mu}_{j}(\alpha)\big{]}\big{|}\ln(\mathrm{h}_{j})\big{|}.

\displaystyle\widehat{{\cal R}}_{\vec{h}}(x)=\sup_{\vec{\eta}\in\mathbb{H}}\Big{[}\big{|}\widehat{f}_{\vec{h}\vee\vec{\eta}}(x)-\widehat{f}_{\vec{\eta}}(x)\big{|}-4\widehat{U}_{n}\big{(}x,\vec{h}\vee\vec{\eta}\big{)}-4\widehat{U}_{n}\big{(}x,\vec{\eta}\big{)}\Big{]}_{+};

\displaystyle\widehat{U}^{*}_{n}\big{(}x,\vec{h}\big{)}=\sup_{\vec{\eta}\in\mathbb{H}:\;\vec{\eta}\geq\vec{h}}\widehat{U}_{n}\big{(}x,\vec{\eta}\big{)},

\Delta_{u,j}G(x)=G(x+u\mathbf{e}_{j})-G(x),\quad j=1,\ldots,d.

\Delta_{u,j}^{k}G(x)=\Delta_{u,j}\Delta_{u,j}^{k-1}G(x)=\sum_{l=1}^{k}(-1)^{l+k}\binom{k}{l}\Delta_{ul,j}G(x).

\Big{\|}\Delta_{u,j}^{k_{j}}G\Big{\|}_{r_{j}}\leq L_{j}|u|^{\beta_{j}},\;\;\;\;\forall u\in{\mathbb{R}},\;\;\;\forall j=1,\ldots,d.


Full text

Estimation in the convolution structure density model. Part II: adaptation over the scale of anisotropic classes.

O.V. Lepski and T. Willer

Aix Marseille Univ, CNRS, Centrale Marseille, I2M, Marseille, France

Institut de Mathématique de Marseille

Aix-Marseille Université

39, rue F. Joliot-Curie

13453 Marseille, France

Abstract

This paper continues the research started in Lepski and Willer (2016). In the framework of the convolution structure density model on ${\mathbb{R}}^{d}$ , we address the problem of adaptive minimax estimation with ${\mathbb{L}}_{p}$ –loss over the scale of anisotropic Nikol’skii classes. We fully characterize the behavior of the minimax risk for different relationships between regularity parameters and norm indexes in the definitions of the functional class and of the risk. In particular, we show that the boundedness of the function to be estimated leads to an essential improvement of the asymptotic of the minimax risk. We prove that the selection rule proposed in Part I leads to the construction of an optimally or nearly optimally (up to logarithmic factor) adaptive estimator.

AMS subject classifications: 62G05, 62G20.

Keywords: deconvolution model, density estimation, oracle inequality, adaptive estimation, kernel estimators, ${\mathbb{L}}_{p}$–risk, anisotropic Nikol’skii class.

This work has been carried out in the framework of the Labex Archimède (ANR-11-LABX-0033) and of the A*MIDEX project (ANR-11-IDEX-0001-02), funded by the "Investissements d’Avenir" French Government program managed by the French National Research Agency (ANR).

1 Introduction

In the present paper we will be interested in the adaptive estimation in the convolution structure density model. Our considerations here continue the research started in Lepski and Willer (2016).

Thus, we observe i.i.d. vectors $Z_{i}\in{\mathbb{R}}^{d},i=1,\ldots,n,$ with a common probability density $\mathfrak{p}$ satisfying the following structural assumption

$\mathfrak{p}=(1-\alpha)f+\alpha[f\star g],\quad f\in\mathbb{F}_{g}(R),\;\alpha\in[0,1],\qquad(1.1)$

where $\alpha\in[0,1]$ and $g:{\mathbb{R}}^{d}\to{\mathbb{R}}$ are supposed to be known and $f:{\mathbb{R}}^{d}\to{\mathbb{R}}$ is the function to be estimated. Recall that for two functions $f,g\in{\mathbb{L}}_{1}\big{(}{\mathbb{R}}^{d}\big{)}$

$\big{[}f\star g\big{]}(x)=\int_{{\mathbb{R}}^{d}}f(x-z)g(z)\nu_{d}({\rm d}z),\;\;x\in{\mathbb{R}}^{d},$

and for any $\alpha\in[0,1]$ , $g\in{\mathbb{L}}_{1}\big{(}{\mathbb{R}}^{d}\big{)}$ and $R>1$ ,

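$\mathbb{F}_{g}(R)=\Big{\{}f\in\mathbb{B}_{1,d}(R):\;(1-\alpha)f+\alpha[f\star g]\in\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}\Big{\}}.$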

Furthermore $\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}$ denotes the set of probability densities on ${\mathbb{R}}^{d}$ , $\mathbb{B}_{s,d}(R)$ is the ball of radius $R>0$ in ${\mathbb{L}}_{s}\big{(}{\mathbb{R}}^{d}\big{)}:={\mathbb{L}}_{s}\big{(}{\mathbb{R}}^{d},\nu_{d}\big{)},1\leq s\leq\infty$ and $\nu_{d}$ is the Lebesgue measure on ${\mathbb{R}}^{d}$ . At last, for any $U\in{\mathbb{L}}_{1}\big{(}{\mathbb{R}}^{d}\big{)}$ let $\check{U}(t):=\int_{{\mathbb{R}}^{d}}U(x)e^{-i\sum_{j=1}^{d}x_{j}t_{j}}\nu_{d}({\rm d}x),t\in{\mathbb{R}}^{d},$ be the Fourier transform of $U$ .

The convolution structure density model (1.1) will be studied for an arbitrary $g\in{\mathbb{L}}_{1}\big{(}{\mathbb{R}}^{d}\big{)}$ and $f\in\mathbb{F}_{g}(R)$ . Then, except in the case $\alpha=0$ , the function $f$ is not necessarily a probability density.

We want to estimate $f$ using the observations $Z^{(n)}=(Z_{1},\ldots,Z_{n})$ . By estimator, we mean any $Z^{(n)}$ -measurable map $\hat{f}:{\mathbb{R}}^{n}\to{\mathbb{L}}_{p}\big{(}{\mathbb{R}}^{d}\big{)}$ . The accuracy of an estimator $\hat{f}$ is measured by the ${\mathbb{L}}_{p}$ –risk

${\cal R}^{(p)}_{n}[\hat{f},f]:=\Big{(}\mathbb{E}_{f}\|\hat{f}-f\|_{p}^{p}\Big{)}^{1/p},\;p\in[1,\infty),$

where $\mathbb{E}_{f}$ denotes the expectation with respect to the probability measure ${\mathbb{P}}_{f}$ of the observations $Z^{(n)}=(Z_{1},\ldots,Z_{n})$ . Also, $\|\cdot\|_{p}$ , $p\in[1,\infty)$ , is the ${\mathbb{L}}_{p}$ -norm on ${\mathbb{R}}^{d}$ . The objective is to construct an estimator of $f$ with a small ${\mathbb{L}}_{p}$ –risk.

1.1 Adaptive estimation

Let $\mathbb{F}$ be a given subset of ${\mathbb{L}}_{p}\big{(}{\mathbb{R}}^{d}\big{)}$. For any estimator $\tilde{f}_{n}$ define its maximal risk by ${\cal R}^{(p)}_{n}\big{[}\tilde{f}_{n};\mathbb{F}\big{]}=\sup_{f\in\mathbb{F}}{\cal R}^{(p)}_{n}\big{[}\tilde{f}_{n};f\big{]}$; the minimax risk on $\mathbb{F}$ is given by

$\phi_{n}(\mathbb{F}):=\inf_{\tilde{f}_{n}}{\cal R}^{(p)}_{n}\big{[}\tilde{f}_{n};\mathbb{F}\big{]}.$

Here, the infimum is taken over all possible estimators. An estimator whose maximal risk is bounded by $\phi_{n}(\mathbb{F})$ up to some constant factor is called minimax on $\mathbb{F}$ .

Let $\big{\{}\mathbb{F}_{\vartheta},\vartheta\in\Theta\big{\}}$ be a collection of subsets of ${\mathbb{L}}_{p}\big{(}{\mathbb{R}}^{d},\nu_{d}\big{)}$ , where $\vartheta$ is a nuisance parameter which may have a very complicated structure.

The problem of adaptive estimation can be formulated as follows: is it possible to construct a single estimator $\hat{f}_{n}$ which would be simultaneously minimax on each class $\mathbb{F}_{\vartheta},\;\vartheta\in\Theta$ , i.e.

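$\limsup_{n\to\infty}\phi^{-1}_{n}(\mathbb{F}_{\vartheta}){\cal R}^{(p)}_{n}\big{[}\hat{f}_{n};\mathbb{F}_{\vartheta}\big{]}<\infty,\;\;\forall\vartheta\in\Theta?$

We refer to this question as *the problem of minimax adaptive estimation over the scale* $\{\mathbb{F}_{\vartheta},\;\vartheta\in\Theta\}$. If such an estimator exists, we will call it optimally adaptive. In modern statistical language, we call the estimator $\hat{f}_{n}$ nearly optimally adaptive if

$\limsup_{n\to\infty}\phi^{-1}_{\frac{n}{\ln n}}(\mathbb{F}_{\vartheta}){\cal R}^{(p)}_{n}\big{[}\hat{f}_{n};\mathbb{F}_{\vartheta}\big{]}<\infty,\;\;\forall\vartheta\in\Theta.$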

We will be interested in adaptive estimation over the scale

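$\mathbb{F}_{\vartheta}={\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g,\mathbf{\infty}}(R,Q),\;\;\vartheta=\big{(}\vec{\beta},\vec{r},\vec{L},R,Q\big{)},$

where ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}$ is the anisotropic Nikol’skii class, see Definition 1 below. As explained in Part I, adaptive estimation over the scale $\big{\{}{\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)},\;\big{(}\vec{\beta},\vec{r},\vec{L}\big{)}\in(0,\infty)^{d}\times[1,\infty]^{d}\times(0,\infty)^{d}\big{\}}$ can be viewed as adaptation to the anisotropy and inhomogeneity of the function to be estimated. Recall also that

$\mathbb{F}_{g,\mathbf{\infty}}(R,Q):=\big{\{}f\in\mathbb{F}_{g}(R):\;(1-\alpha)f+\alpha[f\star g]\in\mathbb{B}_{\mathbf{\infty},d}(Q)\big{\}},$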

so $f\in\mathbb{F}_{g,\mathbf{\infty}}(R,Q)$ simply means that the common density of observations $\mathfrak{p}$ is uniformly bounded by $Q$ . It is easy to see that if $\alpha=1$ and $\|g\|_{\infty}<\infty$ then $\mathbb{F}_{g,\mathbf{\infty}}(R,Q)=\mathbb{F}_{g}(R)$ for any $Q\geq R\|g\|_{\infty}$ .

Let us briefly discuss another example. Let $r>1$ and $L<\infty$ be arbitrary but a priori chosen numbers. Assume that the considered collection of anisotropic Nikol’skii classes obeys the following restrictions: $\vec{r}\in[r,\infty]^{d}$ and $\vec{L}\in(0,L]^{d}$. Suppose also that $\|g\|_{s}<\infty$, where $1/s=1-1/r$. Then, there exists $Q_{0}$ completely determined by $r,L$ and $R$ such that ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g,\mathbf{\infty}}(R,Q)={\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g}(R)$ for any $Q>Q_{0}\|g\|_{s}$.

Additionally, we will study the adaptive estimation over the collection

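$\mathbb{F}_{\vartheta}={\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g}(R)\cap\mathbb{B}_{\infty,d}(Q),\;\vartheta=\big{(}\vec{\beta},\vec{r},\vec{L},R,Q\big{)}.$

We will show that the boundedness of the underlying function allows one to improve considerably the accuracy of estimation.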

1.2 Historical notes

The minimax adaptive estimation is a very active area of mathematical statistics and the interested reader can find a very detailed overview as well as several open problems in adaptive estimation in the recent paper, Lepski (2015). Below we will discuss only the articles whose results are relevant to our consideration, i.e. the density setting under ${\mathbb{L}}_{p}$ -loss, from a minimax perspective.

Let us start with the following remark. If one assumes additionally that $f,g\in\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}$ the convolution structure density model can be interpreted as follows. The observations $Z_{i}\in{\mathbb{R}}^{d},i=1,\ldots,n,$ can be written as a sum of two independent random vectors, that is,

$Z_{i}=X_{i}+\epsilon_{i}Y_{i},\quad i=1,\ldots,n,\qquad(1.3)$

where $X_{i},i=1,\ldots,n,$ are i.i.d. $d$-dimensional random vectors with common density $f$ to be estimated. The noise variables $Y_{i},i=1,\ldots,n,$ are i.i.d. $d$-dimensional random vectors with known common density $g$. Finally, $\epsilon_{i}\in\{0,1\},i=1,\ldots,n,$ are i.i.d. Bernoulli random variables with ${\mathbb{P}}(\epsilon_{1}=1)=\alpha$, where $\alpha\in[0,1]$ is supposed to be known. The sequences $\{X_{i},i=1,\ldots,n\}$, $\{Y_{i},i=1,\ldots,n\}$ and $\{\epsilon_{i},i=1,\ldots,n\}$ are supposed to be mutually independent.

The observation scheme (1.3) can be viewed as a generalization of two classical statistical models. Indeed, the case $\alpha=1$ corresponds to the standard deconvolution model $Z_{i}=X_{i}+Y_{i},\;i=1,\ldots,n$. The other "extreme" case $\alpha=0$ corresponds to the direct observation scheme $Z_{i}=X_{i},\;i=1,\ldots,n$. The "intermediate" case $\alpha\in(0,1)$, considered for the first time in Hesse (1995), is understood as partially contaminated observations.
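The partially contaminated scheme $Z_{i}=X_{i}+\epsilon_{i}Y_{i}$ can be illustrated by a short simulation. The sketch below is only an illustration under assumptions that are not part of the paper: the signal density $f$ is taken to be standard Gaussian, the noise density $g$ is a product of standard Laplace laws, and the function name `simulate_observations` is hypothetical.

```python
import random


def simulate_observations(n, d, alpha, seed=0):
    """Draw (Z, X, eps) from Z_i = X_i + eps_i * Y_i in dimension d."""
    rng = random.Random(seed)
    zs, xs, eps = [], [], []
    for _ in range(n):
        # X_i: signal with density f (assumed standard Gaussian here).
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Y_i: noise with density g (assumed Laplace; the difference of two
        # independent Exp(1) variables has the standard Laplace law).
        y = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(d)]
        # eps_i ~ Bernoulli(alpha) flags whether observation i is contaminated.
        e = 1 if rng.random() < alpha else 0
        zs.append([xj + e * yj for xj, yj in zip(x, y)])
        xs.append(x)
        eps.append(e)
    return zs, xs, eps


# alpha = 0 gives direct observations, alpha = 1 pure deconvolution.
z_direct, x_direct, _ = simulate_observations(5, 2, alpha=0.0)
z_mixed, _, flags = simulate_observations(5, 2, alpha=0.5)
```

In practice only the vectors $Z_{i}$ are observed; the function returns $X_{i}$ and $\epsilon_{i}$ as well so that the two extreme cases can be checked directly.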

Direct case, $\alpha=0$

There is a vast literature dealing with minimax and minimax adaptive density estimation, see for example, Efroimovich (1986), Hasminskii and Ibragimov (1990), Golubev (1992), Donoho et al. (1996), Devroye and Lugosi (1997), Rigollet (2006), Rigollet and Tsybakov (2007), Samarov and Tsybakov (2007), Birgé (2008), Giné and Nickl (2009), Akakpo (2012), Gach et al. (2013), Lepski (2013), among many others. Special attention was paid to the estimation of densities with unbounded support, see Juditsky and Lambert–Lacroix (2004), Reynaud–Bouret et al. (2011). The most developed results can be found in Goldenshluger and Lepski (2011), Goldenshluger and Lepski (2014) and in Section 2 we will compare in detail our results with those obtained in these papers.

Intermediate case, $\alpha\in(0,1)$

To the best of our knowledge, adaptive estimation in the case of partially contaminated observations has not been studied yet. We were able to find only two papers dealing with minimax estimation. The first one is Hesse (1995) (where the discussed model was introduced in dimension $1$) in which the author evaluated the ${\mathbb{L}}_{\infty}$-risk of the proposed estimator over a functional class formally corresponding to the Nikol’skii class ${\mathbb{N}}_{\infty,1}(2,1)$. In Yuan and Chen (2002) the latter result was extended to the multidimensional setting, i.e. to the minimax estimation on ${\mathbb{N}}_{\infty,d}\big{(}\vec{2},1\big{)}$. The most intriguing fact is that the accuracy of estimation in partially contaminated noise is the same as in the direct observation scheme. However, none of these articles studied the optimality of the proposed estimators. We will come back to the aforementioned papers in Section 1.3.1 in order to compare the assumptions imposed on the noise density $g$.

Deconvolution case, $\alpha=1$

First let us remark that the behavior of the Fourier transform of the function $g$ plays an important role in all the works dealing with deconvolution. Indeed, ill-posed problems correspond to Fourier transforms decaying towards zero. Our results will be established for "moderately" ill-posed problems, so we detail only results in papers studying that type of operators. This assumption means that there exist $\vec{\mu}=(\mu_{1},\ldots,\mu_{d})\in(0,\infty)^{d}$ and $\Upsilon_{1}>0,\Upsilon_{2}>0$ such that the Fourier transform $\check{g}$ of $g$ satisfies:

$\Upsilon_{1}\prod_{j=1}^{d}(1+t^{2}_{j})^{-\frac{\mu_{j}}{2}}\leq\big{|}\check{g}(t)\big{|}\leq\Upsilon_{2}\prod_{j=1}^{d}(1+t^{2}_{j})^{-\frac{\mu_{j}}{2}},\quad\forall t=(t_{1},\ldots,t_{d})\in{\mathbb{R}}^{d}.\qquad(1.4)$

Some minimax and minimax adaptive results in dimension 1 over different classes of smooth functions can be found in particular in Stefanski and Carroll (1990), Fan (1991), Fan (1993), Pensky and Vidakovic (1999), Fan and Koo (2002), Comte et al. (2006), Hall and Meister (2007), Meister (2009), Lounici and Nickl (2011), Kerkyacharian et al. (2011).

There are very few results in the multidimensional setting. It seems that Masry (1993) was the first paper where the deconvolution problem was studied for multivariate densities. It is worth noting that Masry (1993) considered more general weakly dependent observations and this paper formally does not deal with the minimax setting. However, the results obtained in this paper could be formally compared with the estimation under ${\mathbb{L}}_{\infty}$-loss over the isotropic Hölder class of regularity $2$, i.e. ${\mathbb{N}}_{\infty,d}\big{(}\vec{2},1\big{)}$, which is exactly the same setting as in Yuan and Chen (2002) in the case of partially contaminated observations. Let us also remark that there is no lower bound result in Masry (1993). The most developed results in the deconvolution model were obtained in Comte and Lacour (2013) and Rebelles (2016) and in Section 2 we will compare in detail our results with those obtained in these papers.

1.3 Lower bound for the minimax ${\mathbb{L}}_{p}$ -risk

We have seen that the problem of optimal adaptation over the collection $\big{\{}\mathbb{F}_{\vartheta},\vartheta\in\Theta\big{\}}$ is formulated as the "attainability" of the family of minimax risks $\big{\{}\phi_{n}(\mathbb{F}_{\vartheta}),\vartheta\in\Theta\big{\}}$ by a single estimator. Although it is not necessary, the following "two-stage" approach is used for the majority of problems related to the minimax adaptive estimation. The first step consists in finding a lower bound for $\phi_{n}(\mathbb{F}_{\vartheta})$ for any $\vartheta\in\Theta$ while the second one consists in constructing an estimator "attaining", at least asymptotically, this bound. We adopt this strategy in our investigations and below we present several lower bound results recently obtained in Lepski and Willer (2017).

1.3.1 Assumptions on the function $g$ imposed in Lepski and Willer (2017)

Let $\mathfrak{J}^{*}$ denote the set of all subsets of $\{1,\ldots,d\}$ . Set $\mathfrak{J}=\mathfrak{J}^{*}\cup\emptyset$ and for any $J\in\mathfrak{J}$ let $|J|$ denote the cardinality of $J$ while $\{j_{1}<\cdots<j_{|J|}\}$ denotes its elements.

For any $J\in\mathfrak{J}^{*}$ define the operator $\mathfrak{D}^{J}=\frac{\partial^{|J|}}{\partial t_{j_{1}}\cdots\partial t_{j_{|J|}}}$ and let $\mathfrak{D}^{\emptyset}$ denote the identity operator. For any $I,J\in\mathfrak{J}$ define $\mathfrak{D}^{I,J}=\mathfrak{D}^{I}\big{(}\mathfrak{D}^{J}\big{)}$ and note that obviously $\mathfrak{D}^{I,J}=\mathfrak{D}^{J,I}$ .

Assumption 1 ( $\alpha\neq 1$ ).

$\mathfrak{D}^{J}\check{g}$ exists for any $J\in\mathfrak{J}^{*}$ and $\;\sup_{J\in\mathfrak{J}^{*}}\big{\|}\mathfrak{D}^{J}\check{g}\big{\|}_{\infty}<\infty.$

Assumption 2 ( $\alpha=1$ ).

$\mathfrak{D}^{J}\check{g}$ exists for any $J\in\mathfrak{J}^{*}$ and $\sup_{J\in\mathfrak{J}^{*}}\big{\|}\check{g}^{-1}\mathfrak{D}^{J}\check{g}\big{\|}_{\infty}<\infty.$ Moreover, there exist $\vec{\mu}=(\mu_{1},\ldots,\mu_{d})\in(0,\infty)^{d}$ and $\Upsilon>0$ such that

$|\check{g}(t)|\leq\Upsilon\prod_{j=1}^{d}(1+t^{2}_{j})^{-\frac{\mu_{j}}{2}},\quad\forall t=(t_{1},\ldots,t_{d})\in{\mathbb{R}}^{d}.$

Assumption 3 ( $\alpha=1$ ).

$g$ is a bounded function.

Assumption 4 ( $\alpha=1$ ).

$\mathfrak{D}^{I,J}\check{g}$ exists for any $I,J\in\mathfrak{J}$ and $\;\sup_{I,J\in\mathfrak{J}}\big{\|}\mathfrak{D}^{I,J}\big{(}\check{g}\big{)}\big{\|}_{1}<\infty$. Moreover

$\sup_{J\in\mathfrak{J}^{*}}\int_{{\mathbb{R}}^{d}}g(z)\Big{(}\prod_{j\in J}z^{2}_{j}\Big{)}{\rm d}z<\infty.$

It is worth noting that all the bounds in Lepski and Willer (2017) are obtained under Assumptions 1 and 2. Assumption 3 is used when the estimation of unbounded functions is considered; we come back to this assumption in Section 2.4.2.

As to Assumption 4, it seems purely technical and does not appear in upper bound results. We also recall that the lower bounds in Lepski and Willer (2017) are proved under the condition: $g\in\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}$ .

1.3.2 Some lower bounds from Lepski and Willer (2017)

Set $\vec{\boldsymbol{\mu}}(\alpha)=\vec{\mu}$ , $\alpha=1$ , $\vec{\boldsymbol{\mu}}(\alpha)=(0,\ldots,0)$ , $\alpha\in[0,1)$ , and introduce for any $\vec{\beta}\in(0,\infty)^{d}$ , $\vec{r}\in[1,\infty]^{d}$ and $\vec{L}\in(0,\infty)^{d}$ the following quantities.

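$\frac{1}{\beta(\alpha)}=\sum_{j=1}^{d}\frac{2\boldsymbol{\mu}_{j}(\alpha)+1}{\beta_{j}},\quad\frac{1}{\omega(\alpha)}=\sum_{j=1}^{d}\frac{2\boldsymbol{\mu}_{j}(\alpha)+1}{\beta_{j}r_{j}},\quad L(\alpha)=\prod_{j=1}^{d}L_{j}^{\frac{2\boldsymbol{\mu}_{j}(\alpha)+1}{\beta_{j}}}.$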

Define for any $1\leq s\leq\infty$ and $\alpha\in[0,1]$

$\varkappa_{\alpha}(s)=\omega(\alpha)(2+1/\beta(\alpha))-s,\quad\tau(s)=1-1/\omega(0)+1/(s\beta(0)).$

General case. Recall that $z(\alpha)=\omega(\alpha)(2+1/\beta(\alpha))\beta(0)\tau(\infty)+1$, $p^{*}=\big{[}\max_{l=1,\ldots,d}r_{l}\big{]}\vee p$. Set

[TABLE]

Here and later we assume $0/0=0$, which implies in particular that $\frac{\omega(\alpha)(1-p^{*}/p)}{\varkappa_{\alpha}(p^{*})}=0$ if $p^{*}=p$ and $\varkappa_{\alpha}(p)=0$. Recall also that $\varkappa_{\alpha}(p^{*})/p^{*}=-1$ if $p^{*}=\infty$. Finally, put

[TABLE]

Theorem 1 (Lepski and Willer (2017)).

Let $L_{0}>0$ and $1\leq p<\infty$ be fixed.

Then for any $\vec{\beta}\in(0,\infty)^{d},\;\vec{r}\in[1,\infty]^{d}$ , $\vec{L}\in[L_{0},\infty)^{d}$ , $\vec{\mu}\in(0,\infty)^{d}$ , $R>1$ and $g\in\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}$ , satisfying Assumptions 1–4, there exists $c>0$ independent of $\vec{L}$ such that

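$\liminf_{n\to\infty}\;\inf_{\tilde{f}_{n}}\sup_{f\in{\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g}(R)}\boldsymbol{\delta}_{n}^{-\varrho(\alpha)}{\cal R}^{(p)}_{n}\big{[}\tilde{f}_{n};f\big{]}\geq c,$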

where the infimum is taken over all possible estimators.

Following the terminology used in Lepski and Willer (2017), we will call the set of parameters satisfying $\varkappa_{\alpha}(p)>p\omega(\alpha)$ the tail zone, satisfying $0<\varkappa_{\alpha}(p)\leq p\omega(\alpha)$ the dense zone and satisfying $\varkappa_{\alpha}(p)\leq 0$ the sparse zone. In turn, the latter zone is divided into two sub-domains: the sparse zone 1 corresponding to $\tau(p^{*})>0$ and the sparse zone 2 corresponding to $\tau(p^{*})\leq 0$.
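The zone classification is a direct computation from $\beta(\alpha)$, $\omega(\alpha)$, $\varkappa_{\alpha}(s)=\omega(\alpha)(2+1/\beta(\alpha))-s$ and $\tau(s)=1-1/\omega(0)+1/(s\beta(0))$. The following sketch evaluates these quantities for given parameters. The function name `zone` and its signature are hypothetical, and the code assumes that at least one $r_{j}$ is finite so that $\omega(\alpha)<\infty$.

```python
import math


def zone(beta, r, p, mu, alpha):
    """Classify (beta, r, p) into the tail, dense or sparse zone.

    beta, r, mu are lists of length d; r_j may be math.inf.
    Assumes at least one r_j is finite, so that omega(alpha) is finite.
    """
    d = len(beta)
    # mu(alpha) equals mu when alpha = 1 and the zero vector otherwise.
    mu_a = list(mu) if alpha == 1.0 else [0.0] * d
    inv_beta_a = sum((2.0 * m + 1.0) / b for m, b in zip(mu_a, beta))
    inv_omega_a = sum((2.0 * m + 1.0) / (b * rj)
                      for m, b, rj in zip(mu_a, beta, r))
    inv_beta_0 = sum(1.0 / b for b in beta)
    inv_omega_0 = sum(1.0 / (b * rj) for b, rj in zip(beta, r))

    omega_a = 1.0 / inv_omega_a
    # kappa_alpha(p) = omega(alpha) * (2 + 1/beta(alpha)) - p
    kappa_p = omega_a * (2.0 + inv_beta_a) - p
    # p* = max_l r_l v p, and tau(p*) = 1 - 1/omega(0) + 1/(p* beta(0))
    p_star = max(max(r), p)
    tau_p_star = 1.0 - inv_omega_0 + inv_beta_0 / p_star

    if kappa_p > p * omega_a:
        return "tail"
    if kappa_p > 0.0:
        return "dense"
    return "sparse zone 1" if tau_p_star > 0.0 else "sparse zone 2"


# Direct case (alpha = 0), d = 1, beta = 1, r = 1, p = 4.
label = zone([1.0], [1.0], 4.0, [0.0], 0.0)
```

For instance, with $d=1$, $\alpha=0$, $\beta=1$, $r=1$ and $p=4$ one gets $\varkappa_{0}(4)=3-4=-1\leq 0$ and $\tau(4)=1/4>0$, so these parameters fall in the sparse zone 1.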

Bounded case. Introduce

[TABLE]

Theorem 2 (Lepski and Willer (2017)).

Let $L_{0}>0$ and $1\leq p<\infty$ be fixed.

Then for any $\vec{\beta}\in(0,\infty)^{d},\;\vec{r}\in[1,\infty]^{d}$ , $\vec{L}\in[L_{0},\infty)^{d}$ , $Q>0$ , $\vec{\mu}\in(0,\infty)^{d}$ and $g\in\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}$ , satisfying Assumptions 1 and 2 there exists $c>0$ independent of $\vec{L}$ such that

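$\liminf_{n\to\infty}\;\inf_{\tilde{f}_{n}}\sup_{f\in{\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}\cap\mathbb{B}_{\infty,d}(Q)}\boldsymbol{\delta}_{n}^{-\rho(\alpha)}{\cal R}^{(p)}_{n}\big{[}\tilde{f}_{n};f\big{]}\geq c,$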

where the infimum is taken over all possible estimators.

1.4 Assumptions on the function $g$

The selection rule from the family of linear estimators, the ${\mathbb{L}}_{p}$ -norm oracle inequalities obtained in Part I and all the adaptive results presented in the paper are established under the following condition imposed on the function $g$ .

Assumption 5.

(1) if $\alpha\neq 1$ then there exists $\varepsilon>0$ such that

$\big{|}1-\alpha+\alpha\check{g}(t)\big{|}\geq\varepsilon,\quad\forall t\in{\mathbb{R}}^{d};$

(2) if $\alpha=1$ then there exist $\vec{\mu}=(\mu_{1},\ldots,\mu_{d})\in(0,\infty)^{d}$ and $\Upsilon_{0}>0$ such that

$|\check{g}(t)|\geq\Upsilon_{0}\prod_{j=1}^{d}(1+t^{2}_{j})^{-\frac{\mu_{j}}{2}},\quad\forall t=(t_{1},\ldots,t_{d})\in{\mathbb{R}}^{d}.$

Comparing this condition with Assumption 2 from Section 1.3.1, we can assert that both are coherent if $\alpha=1$. Indeed, in this case, we arrive at the following assumption, which is well-known in the literature:

$\Upsilon_{0}\prod_{j=1}^{d}(1+t^{2}_{j})^{-\frac{\mu_{j}}{2}}\leq|\check{g}(t)|\leq\Upsilon\prod_{j=1}^{d}(1+t^{2}_{j})^{-\frac{\mu_{j}}{2}},\quad\forall t\in{\mathbb{R}}^{d}.$

referred to as a moderately ill-posed statistical problem, cf. (1.4). In particular, the assumption holds for the centered multivariate Laplace law.

Note first that Assumption 5 is in some sense weaker than Assumption 1 when $\alpha\in(0,1)$, since it does not require regularity properties of the function $g$. Moreover, neither assumption is too restrictive. They are verified for many distributions, including centered multivariate Laplace and Gaussian ones. Note also that Assumption 5 always holds with $\varepsilon=1-2\alpha$ if $\alpha<1/2$. Additionally, it holds with $\varepsilon=1-\alpha$ if $\check{g}$ is a real positive function. The latter is true, in particular, for any probability law obtained by an even number of convolutions of a symmetric distribution with itself.
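The bound $\big|1-\alpha+\alpha\check{g}(t)\big|\geq 1-\alpha$ for real positive $\check{g}$ can be checked numerically on a grid. The sketch below assumes, for illustration only, that $g$ is the product of standard Laplace densities $\frac{1}{2}e^{-|x_{j}|}$, whose Fourier transform is $\prod_{j=1}^{d}(1+t_{j}^{2})^{-1}$, so that the two-sided polynomial bound holds with $\mu_{j}=2$. The helper names `laplace_ft` and `min_modulus` are hypothetical.

```python
import itertools


def laplace_ft(t):
    """Fourier transform of the product Laplace density prod_j exp(-|x_j|)/2."""
    out = 1.0
    for tj in t:
        out /= 1.0 + tj * tj
    return out


def min_modulus(alpha, d, grid):
    """Minimum of |1 - alpha + alpha * g_check(t)| over the grid^d."""
    return min(abs(1.0 - alpha + alpha * laplace_ft(t))
               for t in itertools.product(grid, repeat=d))


# Frequencies t_j in [-20, 20] with step 0.5, dimension d = 2.
grid = [0.5 * k for k in range(-40, 41)]
for alpha in (0.3, 0.7, 0.95):
    # Assumption 5 (1) with epsilon = 1 - alpha, since g_check > 0.
    assert min_modulus(alpha, 2, grid) >= 1.0 - alpha
```

A grid check of this kind is not a proof; here the inequality is immediate because $\check{g}$ is real and positive.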

Next, our Assumption 5 is weaker than the conditions imposed in Hesse (1995) and Yuan and Chen (2002). In these papers $\check{g}\in\mathbb{C}^{(2)}\big{(}{\mathbb{R}}^{d}\big{)}$, $\check{g}(t)\neq 0$ for any $t\in{\mathbb{R}}^{d}$ and

$\big{|}1-\alpha+\alpha\check{g}(t)\big{|}\geq 1-\alpha,\quad\forall t\in{\mathbb{R}}^{d}.$

2 Adaptive estimation over the scale of anisotropic Nikol’skii classes

We start this section by recalling the definition of the pointwise selection rule proposed in Part I.

2.1 Pointwise selection rule

Let $K:{\mathbb{R}}^{d}\to{\mathbb{R}}$ be a continuous function belonging to ${\mathbb{L}}_{1}\big{(}{\mathbb{R}}^{d}\big{)}$ such that $\int_{{\mathbb{R}}^{d}}K=1$. Set ${\cal H}=\big{\{}e^{k},\;k\in{\mathbb{Z}}\big{\}}$ and let ${\cal H}^{d}=\big{\{}\vec{h}=(h_{1},\ldots,h_{d}):\;h_{j}\in{\cal H},j=1,\ldots,d\big{\}}.$ Recall that ${\cal H}^{d}_{\text{isotr}}=\big{\{}\vec{h}\in{\cal H}^{d}:\;\vec{h}=(h,\ldots,h),\;h\in{\cal H}\big{\}}.$ Set $V_{\vec{h}}=\prod_{j=1}^{d}h_{j}$ and let for any $\vec{h}\in{\cal H}^{d}$

$K_{\vec{h}}(t)=V^{-1}_{\vec{h}}K\big{(}t_{1}/h_{1},\ldots,t_{d}/h_{d}\big{)},\;t\in{\mathbb{R}}^{d}.$

Later on for any $u,v\in{\mathbb{R}}^{d}$ the operations and relations $u/v$ , $uv$ , $u\vee v$ , $u\wedge v$ , $u\geq v$ , $au,a\in{\mathbb{R}},$ are understood in coordinate-wise sense. In particular $u\geq v$ means that $u_{j}\geq v_{j}$ for any $j=1,\ldots,d$ .

For any $\vec{h}\in(0,\infty)^{d}$ let $M\big{(}\cdot,\vec{h}\big{)}$ satisfy the operator equation

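$K_{\vec{h}}(y)=(1-\alpha)M\big{(}y,\vec{h}\big{)}+\alpha\int_{{\mathbb{R}}^{d}}g(t-y)M\big{(}t,\vec{h}\big{)}{\rm d}t,\quad y\in{\mathbb{R}}^{d}.$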

Introduce for any $\vec{\mathrm{h}}\in{\cal H}^{d}$ and $x\in{\mathbb{R}}^{d}$

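$\widehat{f}_{\vec{\mathrm{h}}}(x)=n^{-1}\sum_{i=1}^{n}M\big{(}Z_{i}-x,\vec{\mathrm{h}}\big{)},\qquad\widehat{\sigma}^{2}\big{(}x,\vec{\mathrm{h}}\big{)}=\frac{1}{n}\sum_{i=1}^{n}M^{2}\big{(}Z_{i}-x,\vec{\mathrm{h}}\big{)};$

$\widehat{U}_{n}\big{(}x,\vec{\mathrm{h}}\big{)}=\sqrt{\frac{2\lambda_{n}\big{(}\vec{\mathrm{h}}\big{)}\widehat{\sigma}^{2}\big{(}x,\vec{\mathrm{h}}\big{)}}{n}}+\frac{4M_{\infty}\lambda_{n}\big{(}\vec{\mathrm{h}}\big{)}}{3n\prod_{j=1}^{d}\mathrm{h}_{j}(\mathrm{h}_{j}\wedge 1)^{\boldsymbol{\mu}_{j}(\alpha)}},$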

where $M_{\infty}=\big{[}(2\pi)^{-d}\big{\{}\varepsilon^{-1}\big{\|}\check{K}\big{\|}_{1}\mathrm{1}_{\alpha\neq 1}+\Upsilon_{0}^{-1}\mathbf{k}_{1}\mathrm{1}_{\alpha=1}\big{\}}\big{]}\vee 1$ and

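$\lambda_{n}\big{(}\vec{\mathrm{h}}\big{)}=4\ln(M_{\infty})+6\ln{(n)}+(8p+26)\sum_{j=1}^{d}\big{[}1+\boldsymbol{\mu}_{j}(\alpha)\big{]}\big{|}\ln(\mathrm{h}_{j})\big{|}.$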

Let $\mathbb{H}$ be an arbitrary subset of ${\cal H}^{d}$ . For any $\vec{h}\in\mathbb{H}$ and $x\in{\mathbb{R}}^{d}$ introduce

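$\widehat{{\cal R}}_{\vec{h}}(x)=\sup_{\vec{\eta}\in\mathbb{H}}\Big{[}\big{|}\widehat{f}_{\vec{h}\vee\vec{\eta}}(x)-\widehat{f}_{\vec{\eta}}(x)\big{|}-4\widehat{U}_{n}\big{(}x,\vec{h}\vee\vec{\eta}\big{)}-4\widehat{U}_{n}\big{(}x,\vec{\eta}\big{)}\Big{]}_{+};$

$\widehat{U}^{*}_{n}\big{(}x,\vec{h}\big{)}=\sup_{\vec{\eta}\in\mathbb{H}:\;\vec{\eta}\geq\vec{h}}\widehat{U}_{n}\big{(}x,\vec{\eta}\big{)},$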

and define $\vec{\mathbf{h}}(x)=\arg\inf_{\vec{h}\in\mathbb{H}}\Big{[}\widehat{{\cal R}}_{\vec{h}}(x)+8\widehat{U}^{*}_{n}\big{(}x,\vec{h}\big{)}\Big{]}.$

Our final estimator is $\widehat{f}_{\vec{\mathbf{h}}(x)}(x),\;x\in{\mathbb{R}}^{d}$ and we will call (2.2) the pointwise selection rule.
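The last step of the rule, the minimisation of $\widehat{{\cal R}}_{\vec{h}}(x)+8\widehat{U}^{*}_{n}\big{(}x,\vec{h}\big{)}$ over $\vec{h}\in\mathbb{H}$ at a fixed point $x$, can be sketched as follows. This is only an illustration under stated assumptions: the candidate set is taken finite, and `r_hat`, `u_hat` are hypothetical callables standing in for $\widehat{{\cal R}}_{\vec{h}}(x)$ and $\widehat{U}^{*}_{n}(x,\vec{h})$, whose formulas are not reproduced here.

```python
def select_bandwidth(candidates, r_hat, u_hat):
    """Return the candidate h minimising r_hat(h) + 8 * u_hat(h) at one fixed x.

    candidates   : iterable of bandwidth vectors (tuples), assumed finite
    r_hat, u_hat : callables h -> float (placeholders for the paper's quantities)
    """
    return min(candidates, key=lambda h: r_hat(h) + 8.0 * u_hat(h))

# Toy illustration with made-up values (hypothetical numbers, not from the paper).
toy_r = {(0.5, 0.5): 0.10, (1.0, 0.5): 0.40, (1.0, 1.0): 0.90}
toy_u = {(0.5, 0.5): 0.20, (1.0, 0.5): 0.05, (1.0, 1.0): 0.01}
h_star = select_bandwidth(toy_r.keys(), toy_r.get, toy_u.get)
```

In the toy example the criterion takes the values 1.70, 0.80 and 0.98, so the middle bandwidth is selected. Repeating the minimisation for every $x$ gives the map $x\mapsto\vec{\mathbf{h}}(x)$.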

Remark 1.

Note that the estimator $\widehat{f}_{\vec{\mathbf{h}}}$ depends on $\mathbb{H}$ and later on we will consider two choices of the parameter set $\mathbb{H}$ , namely $\mathbb{H}={\cal H}^{d}$ and $\mathbb{H}={\cal H}^{d}_{\text{isotr}}$ . So, to present our results we will write $\widehat{f}_{\vec{\mathbf{h}},\mathbb{H}}$ in order to underline the aforementioned dependence. The choice $\mathbb{H}={\cal H}^{d}$ will be used when the adaptation is studied over anisotropic Nikol’skii classes while $\mathbb{H}={\cal H}^{d}_{\text{isotr}}$ will be used when the considered scale consists of isotropic classes.

2.2 Anisotropic Nikol’skii classes

Let $(\mathbf{e}_{1},\ldots,\mathbf{e}_{d})$ denote the canonical basis of ${\mathbb{R}}^{d}$ . For some function $G:{\mathbb{R}}^{d}\to{\mathbb{R}}^{1}$ and real number $u\in{\mathbb{R}}$ define the first order difference operator with step size $u$ in direction of the variable $x_{j}$ by

[TABLE]

By induction, the $k$ -th order difference operator with step size $u$ in direction of the variable $x_{j}$ is defined as

[TABLE]
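The inductive definition can be illustrated numerically. The sketch below assumes the usual forward-difference convention $\Delta_{u,j}G(x)=G(x+u\mathbf{e}_{j})-G(x)$ (the displayed formula is not reproduced in this text, so this convention is an assumption) and obtains the $k$-th order operator by applying the first-order one $k$ times. Function names are illustrative.

```python
def first_difference(G, u, j):
    """Return the function x -> G(x + u e_j) - G(x) (assumed forward convention)."""
    def dG(x):
        shifted = list(x)
        shifted[j] += u
        return G(tuple(shifted)) - G(tuple(x))
    return dG

def kth_difference(G, u, j, k):
    """Apply the first-order difference k times in direction j."""
    for _ in range(k):
        G = first_difference(G, u, j)
    return G

# Example on R^2: G(x) = x_0^2 + 3 x_1.
G = lambda x: x[0] ** 2 + 3.0 * x[1]
second = kth_difference(G, 0.5, 0, 2)   # second-order difference in x_0
third = kth_difference(G, 0.5, 0, 3)    # third-order difference in x_0
```

For this quadratic example the second-order difference in $x_{0}$ equals $2u^{2}$ at every point and the third-order difference vanishes, which is the mechanism behind the requirement $k_{j}>\beta_{j}$ in the definition of the Nikol’skii class.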

Definition 1.

For given vectors $\vec{r}=(r_{1},\ldots,r_{d})\in[1,\infty]^{d}$ , $\vec{\beta}=(\beta_{1},\ldots,\beta_{d})\in(0,\infty)^{d}$ and $\vec{L}=(L_{1},\ldots,L_{d})\in(0,\infty)^{d}$ we say that a function $G:{\mathbb{R}}^{d}\to{\mathbb{R}}^{1}$ belongs to the anisotropic Nikol’skii class ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}$ if

(i) $\|G\|_{r_{j}}\leq L_{j}$ for all $j=1,\ldots,d$ ;

(ii) for every $j=1,\ldots,d$ there exists a natural number $k_{j}>\beta_{j}$ such that

[TABLE]

If $\beta_{j}=\boldsymbol{\beta}\in(0,\infty),r_{j}=\mathbf{r}\in[1,\infty]$ and $L_{j}=\mathbf{L}\in(0,\infty)$ for any $j=1,\ldots,d$ the corresponding Nikol’skii class, henceforth denoted ${\mathbb{N}}_{\mathbf{r},d}(\boldsymbol{\beta},\mathbf{L})$ , is called isotropic.

2.3 Construction of kernel $K$

First, we recall that all results concerning the ${\mathbb{L}}_{p}$ risk of the pointwise selection rule, established in Part I, are proved under the following assumption imposed on the kernel $K$ .

Assumption 6.

There exist $\mathbf{k}_{1}>0$ and $\mathbf{k}_{2}>0$ such that

[TABLE]

Next, we will use the following specific kernel $K$ in the definition of the estimator’s family $\big{\{}\widehat{f}_{\vec{\mathrm{h}}}(\cdot),\;\vec{\mathrm{h}}\in{\cal H}^{d}\big{\}}$ [see, e.g., Kerkyacharian et al. (2001) or Goldenshluger and Lepski (2014)].

Let $\ell$ be a positive integer, and let ${\cal K}:{\mathbb{R}}^{1}\to{\mathbb{R}}^{1}$ be a compactly supported function satisfying $\int_{{\mathbb{R}}^{1}}{\cal K}(y){\rm d}y=1$ and ${\cal K}\in\mathbb{C}({\mathbb{R}}^{1})$ . Put

[TABLE]

and add the following structural condition to Assumption 6.

Assumption 7.

$K(x)=\prod_{j=1}^{d}{\cal K}_{\ell}(x_{j}),\;\forall x\in{\mathbb{R}}^{d}.$

The kernel $K$ constructed in this way is bounded, compactly supported, belongs to $\mathbb{C}({\mathbb{R}}^{d})\cap{\mathbb{L}}_{1}({\mathbb{R}}^{d})$ and satisfies $\int_{{\mathbb{R}}^{d}}K=1$ . Some examples of kernels satisfying simultaneously Assumptions 6 and 7 can be found for instance in Comte and Lacour (2013).
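A numerical sketch of such a construction is given next. Since the displayed formula for ${\cal K}_{\ell}$ is not reproduced in this text, the code assumes the form used in the cited references, ${\cal K}_{\ell}(y)=\sum_{i=1}^{\ell}\binom{\ell}{i}(-1)^{i+1}\frac{1}{i}{\cal K}(y/i)$; both this formula and the triangular base kernel are assumptions made for illustration only.

```python
import math

def base_kernel(y):
    """Triangular kernel on [-1, 1]: continuous, compactly supported, integral 1."""
    return max(0.0, 1.0 - abs(y))

def kernel_ell(y, ell, base=base_kernel):
    """Assumed form: sum_{i=1}^{ell} C(ell, i) (-1)^(i+1) (1/i) base(y / i)."""
    return sum(
        math.comb(ell, i) * (-1) ** (i + 1) * base(y / i) / i
        for i in range(1, ell + 1)
    )

def product_kernel(x, ell):
    """K(x) = prod_j K_ell(x_j), the product structure of Assumption 7."""
    return math.prod(kernel_ell(xj, ell) for xj in x)

def moment(ell, power, steps_per_unit=1000):
    """Midpoint-rule approximation of the integral of y^power * K_ell(y) on [-ell, ell]."""
    n = 2 * ell * steps_per_unit
    step = 1.0 / steps_per_unit
    total = 0.0
    for m in range(n):
        y = -ell + (m + 0.5) * step
        total += (y ** power) * kernel_ell(y, ell) * step
    return total
```

Under the assumed form, $\sum_{i=1}^{\ell}\binom{\ell}{i}(-1)^{i+1}=1$ gives $\int{\cal K}_{\ell}=1$, and the moments of orders $1,\ldots,\ell-1$ vanish; `moment(3, 0)` and `moment(3, 2)` check this numerically for $\ell=3$.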

2.4 Main results

Introduce the following notations: $\delta_{n}=L(\alpha)n^{-1}\ln(n)$ and

[TABLE]

2.4.1 Bounded case

The first problem we address is the adaptive estimation over the collection of the functional classes $\big{\{}{\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g}(R)\cap\mathbb{B}_{\infty,d}(Q)\big{\}}_{\vec{\beta},\vec{r},\vec{L},R,Q}.\;$

As was conjectured in Lepski and Willer (2017), the boundedness of the function belonging to ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g}(R)$ is a minimal condition allowing one to eliminate the inconsistency zone. The results obtained in Theorem 3 below together with those from Theorem 2 confirm this conjecture.

Theorem 3.

Let $\alpha\in[0,1]$ , $\ell\in{\mathbb{N}}^{*}$ and $g\in{\mathbb{L}}_{1}\big{(}{\mathbb{R}}^{d}\big{)}$ , satisfying Assumption 5, be fixed. Let $K$ satisfy Assumptions 6 and 7.

1) Then for any $p\in(1,\infty)$ , $Q>0$ , $R>0$ , $L_{0}>0$ , $\vec{\beta}\in(0,\ell]^{d}$ , $\vec{r}\in(1,\infty]^{d}$ and $\vec{L}\in[L_{0},\infty)^{d}$ there exists $C<\infty$ , independent of $\vec{L}$ , such that:

[TABLE]

where $\rho(\alpha)$ is defined in (1.17).

2) For any $p\in(1,\infty)$ , $Q>0$ , $R>0,L_{0}>0$ , $\boldsymbol{\beta}\in(0,\ell]$ , $\mathbf{r}\in[1,\infty]$ and $\mathbf{L}\in[L_{0},\infty)$ there exists $C<\infty$ , independent of $\mathbf{L}$ , such that:

[TABLE]

Some remarks are in order.

$\mathbf{1^{0}}.\;$ Our estimation procedure is completely data-driven, i.e. independent of $\vec{\beta},\vec{r},\vec{L},R$ , $Q$ , and the assertions of Theorem 3 are completely new if $\alpha\neq 0$ . Comparing the results obtained in Theorems 2 and 3 we can assert that our estimator is optimally-adaptive if $\varkappa_{\alpha}(p)<0$ and nearly optimally adaptive if $0<\varkappa_{\alpha}(p)<p\omega(\alpha)$ . The construction of an estimation procedure which would be optimally-adaptive when $\varkappa_{\alpha}(p)\geq 0$ is an open problem, and we conjecture that the lower bounds for the asymptotics of the minimax risk found in Theorem 2 are sharp in order. This conjecture in the case $\alpha=1$ is partially confirmed by the results obtained in Comte and Lacour (2013) and Rebelles (2016). Since both articles deal with the estimation of unbounded functions we will discuss them in the next section.

It is worth noting that all the previous statements are true not only for the convolution structure density model but also, in view of Theorem 2, for the observation scheme (1.3).

$\mathbf{2^{0}}.\;$ We note that the asymptotic of the minimax risk under partially contaminated observations, $\alpha\in(0,1)$ , is independent of $\alpha$ and coincides with the asymptotic of the risk in the direct observation model, $\alpha=0$ . This phenomenon was first discovered in Hesse (1995) and Yuan and Chen (2002). In the very recent paper Lepski (2017), in the particular case $\vec{r}=(p,\ldots,p)$ , $p\in(1,\infty)$ , an optimally adaptive estimator was built. It is easy to check that independently of the value of $\vec{\beta}$ and $\vec{\mu}$ , the corresponding set of parameters belongs to the dense zone. Note however that our estimator is only nearly optimally adaptive in this zone, but it is applied to a much more general collection of functional classes. It is worth noting that the estimation procedure used in Lepski (2017) has nothing in common with our pointwise selection rule.

$\mathbf{3^{0}}.\;$ As to the direct observation scheme, $\alpha=0$ , our results coincide with those obtained recently in Goldenshluger and Lepski (2014), when $p\omega(0)>\varkappa_{0}(p)$ . However, for the tail zone $p\omega(0)\leq\varkappa_{0}(p)$ , our bound is slightly better since the bound obtained in the latter paper contains an additional factor $\ln^{\frac{d}{p}}(n)$ . It is interesting to note that although both estimator constructions are based upon local selections from the family of kernel estimators, the selection rules are different.

$\mathbf{4^{0}}.\;$ Let us finally discuss the results corresponding to the tail zone, $\varkappa_{\alpha}(p)>p\omega(\alpha)$ . First, the lower bound for the minimax risk is given by $[L(\alpha)n^{-1}]^{\rho(\alpha)}$ while the accuracy provided by our estimator is

[TABLE]

As we mentioned above, the passage from $[L(\alpha)n^{-1}]^{\rho(\alpha)}$ to $[L(\alpha)n^{-1}\ln(n)]^{\rho(\alpha)}$ seems to be an unavoidable payment for the application of a local selection scheme. It is interesting to note that the additional factor $\ln^{\frac{d-1}{p}}(n)$ disappears in the dimension $d=1$ . First, note that if $\alpha=0$ the one-dimensional setting was considered in Juditsky and Lambert–Lacroix (2004) and Reynaud–Bouret et al. (2011). The setting of Juditsky and Lambert–Lacroix (2004) corresponds to $r=\infty$ , while Reynaud–Bouret et al. (2011) deal with the case of $p=2$ and $\tau(2)>0$ . Both settings rule out the sparse zone. The rates of convergence found in these papers are easily recovered from our results corresponding to the tail and dense zones.

Next, we remark that the aforementioned factor appears only when anisotropic functional classes are considered. Indeed, in view of the second assertion of Theorem 3 our estimator is nearly optimally adaptive on the tail zone in the isotropic case. The natural question arising in this context, is whether the $\ln^{\frac{d-1}{p}}(n)$ -factor is an unavoidable payment for anisotropy of the underlying function or not?

At last, we note that in the isotropic case our results remain true when the corresponding Nikol’skii class is defined in ${\mathbb{L}}_{1}$ -norm on ${\mathbb{R}}^{d}$ ( $\mathbf{r}=1$ ). It is worth noting that the analysis of the proof of the theorem allows us to assert that if $r_{j}=1$ , $j\in J$ for some $J\neq\{1,\ldots,d\}$ the first statement remains true up to some logarithmic factor. However the asymptotic of the maximal risk of our estimator if $r_{j}=1$ for any $j=1,\ldots,d$ remains unknown.

$\mathbf{5^{0}}.\;$ We finish our discussion with the following remark. If $\alpha\neq 1$ the assumption $f\in\mathbb{F}_{g,\infty}(R,Q)$ implies in many cases that $f$ is uniformly bounded and, therefore, Theorem 3 is applicable. In particular it is always the case if the model (1.3) is considered. Indeed $f,g\in\mathfrak{P}\big{(}{\mathbb{R}}^{d}\big{)}$ in this case, which implies $\|f\|_{\infty}\leq(1-\alpha)^{-1}\|\mathfrak{p}\|_{\infty}\leq(1-\alpha)^{-1}Q$ . Another case is $\|g\|_{\infty}<\infty$ ; recall that this assumption (see Assumption 3) was used in the proofs of Theorems 1 and 2. In this case we obviously have that

[TABLE]

More generally $\|f\|_{\infty}\leq(1-\alpha)^{-1}(Q+\alpha D)$ if $f\in\mathbb{F}_{g,\infty}(R,Q)$ and $\|f\star g\|_{\infty}\leq D$ . Since the definition of the Nikol’skii class implies that $\|f\|_{r^{*}}\leq L^{*}$ , where $r^{*}=\sup_{j=1,\ldots,d}r_{j}$ and $L^{*}=\sup_{j=1,\ldots,d}L_{j}$ , the latter condition can be verified in particular if $\|g\|_{q}<\infty,1/q=1-1/r^{*}$ . All of the above explains why we study the estimation of unbounded functions only in the case $\alpha=1$ .
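For the reader's convenience, the two elementary steps behind this remark can be written out as follows. This is a sketch that uses only the model equation $\mathfrak{p}=(1-\alpha)f+\alpha[f\star g]$ with $\alpha\neq 1$, the bound $\|\mathfrak{p}\|_{\infty}\leq Q$ and the Hölder inequality; the explicit choice $D=L^{*}\|g\|_{q}$ in the second line is our reading of the remark.

```latex
\begin{align*}
(1-\alpha)\|f\|_{\infty}
  &=\big\|\mathfrak{p}-\alpha[f\star g]\big\|_{\infty}
   \leq\|\mathfrak{p}\|_{\infty}+\alpha\|f\star g\|_{\infty}
   \leq Q+\alpha D,\\
\|f\star g\|_{\infty}
  &\leq\|f\|_{r^{*}}\|g\|_{q}
   \leq L^{*}\|g\|_{q},
   \qquad \frac{1}{r^{*}}+\frac{1}{q}=1.
\end{align*}
```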

2.4.2 Unbounded case, $\alpha=1$

The problem we address now is the adaptive estimation over the collection of functional classes $\big{\{}{\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g,\infty}(R,Q)\big{\}}_{\vec{\beta},\vec{r},\vec{L},R,Q}.\;$

As we already mentioned, if additionally $\|g\|_{\infty}<\infty$ then $\mathbb{F}_{g,\infty}(R,Q)=\mathbb{F}_{g}(R)$ for any $Q\geq R\|g\|_{\infty}$ and, therefore, in view of Theorem 1 discussed in Section 1.3, there is no consistent estimator if either $p=1$ or $\varkappa_{\alpha}(p)\leq 0,\;\tau(p)\leq 0,\;p^{*}=p$ . Analyzing the proof of the latter theorem, we come to the following assertion.

Conjecture 1.

Let $\alpha=1$ and assume that Assumption 4 is fulfilled. Suppose additionally that Assumption 2 holds with $\min_{j=1,\ldots,d}\mu_{j}>1/p$ . Then, the assertion of Theorem 1 remains true if one replaces ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g}(R)$ by ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g,\infty}(R,Q)$ .

The latter result is formulated as a conjecture only because we will not prove it in the present paper. Its proof is postponed to Part III where the adaptive estimation over the collection

[TABLE]

introduced in Part I will be studied. For this reason, later on we will only consider the parameters $\vec{\beta},\vec{r}$ belonging to the set ${\cal P}_{p,\vec{\mu}}$ defined below.

[TABLE]

For given $p>1$ and $\vec{\mu}\in(0,\infty)^{d}$ the latter set consists of the class parameters for which uniformly consistent estimation is possible.

Theorem 4.

Let $\ell\in{\mathbb{N}}^{*}$ and $g\in{\mathbb{L}}_{1}\big{(}{\mathbb{R}}^{d}\big{)}$ , satisfying Assumption 5 be fixed and let $K$ satisfy Assumptions 6 and 7.

1) Then for any $p>[\min_{j=1,\ldots,d}\mu_{j}]^{-1}$ , $R,Q>0$ , $0<L_{0}\leq L_{\infty}<\infty$ , $\big{(}\vec{\beta},\vec{r}\big{)}\in{\cal P}_{p,\vec{\mu}}\cap\big{\{}(0,\ell]^{d}\times(1,\infty]^{d}\big{\}}$ and $\vec{L}\in[L_{0},L_{\infty}]^{d}$ there exists $C<\infty$ , independent of $\vec{L}$ , such that:

[TABLE]

where $\varrho(\cdot)$ is defined in (1.11).

2) For any $p>[\min_{j=1,\ldots,d}\mu_{j}]^{-1}$ , $R,Q>0$ , $0<L_{0}\leq L_{\infty}<\infty$ , $(\boldsymbol{\beta},\mathbf{r})\in{\cal P}_{p,\vec{\mu}}\cap\big{\{}(0,\ell]\times(1,\infty]\big{\}}$ and $\mathbf{L}\in[L_{0},L_{\infty}]$ there exists $C<\infty$ , independent of $\mathbf{L}$ , such that:

[TABLE]

Some remarks are in order.

$\mathbf{1^{0}}.\;$ Note that $\|g\|_{1}<\infty,\|g\|_{\infty}<\infty$ imply that $\|g\|_{2}<\infty$ and, therefore, the Parseval identity together with Assumption 5 allows us to assert that

[TABLE]

Hence, the condition $p>[\min_{j=1,\ldots,d}\mu_{j}]^{-1}$ is automatically satisfied if $p\geq 2$ and $\|g\|_{\infty}<\infty$ .

Also, it is worth noting that when considering the adaptation over the collection of isotropic classes, we do not require the coordinates of $\vec{\mu}$ to be the same. The same is true for the second assertion of Theorem 3. Finally, analyzing the proof of the theorem, we can assert that the second assertion remains true under the slightly weaker assumption $p>d(\mu_{1}+\cdots+\mu_{d})^{-1}$ .
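That the assumption $p>d(\mu_{1}+\cdots+\mu_{d})^{-1}$ is indeed weaker than $p>[\min_{j=1,\ldots,d}\mu_{j}]^{-1}$ follows from a one-line comparison of the mean and the minimum of the $\mu_{j}$ (a worked step added for convenience):

```latex
\[
\mu_{1}+\cdots+\mu_{d}\;\geq\;d\min_{j=1,\ldots,d}\mu_{j}
\quad\Longrightarrow\quad
\frac{d}{\mu_{1}+\cdots+\mu_{d}}\;\leq\;\Big[\min_{j=1,\ldots,d}\mu_{j}\Big]^{-1},
\]
```

with equality if and only if all coordinates of $\vec{\mu}$ coincide.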

$\mathbf{2^{0}}.\;$ The assertion of Theorem 4 has no analogue in the existing literature except the results obtained in Comte and Lacour (2013) and Rebelles (2016). Comte and Lacour (2013) deal with the particular case $p=2$ , $\vec{r}=(2,\ldots,2)$ while Rebelles (2016) studied the case $\vec{r}=(p,\ldots,p)$ , $p\in(1,\infty)$ . It is easy to check that in both papers whatever the value of $\vec{\beta}$ and $\vec{\mu}$ , the corresponding set of parameters belongs to the dense zone. Note also that the estimation procedures used in Comte and Lacour (2013) as well as in Rebelles (2016), if $p\geq 2$ , (both based on a global version of the Goldenshluger-Lepski method) are optimally-adaptive. They attain the asymptotic of minimax risks corresponding to the dense zone found in Theorem 1, while our method is only nearly optimally adaptive. However, it is well-known that the global selection from the family of standard kernel estimators leads to correct results only if $\vec{r}=(p,\ldots,p)$ when the ${\mathbb{L}}_{p}$ -risk is considered, see, for instance Goldenshluger and Lepski (2011). On the other hand, estimation procedures based on a local selection scheme, which can be applied to the estimation of functions belonging to much more general functional classes, often do not lead to an optimally adaptive method. Fortunately, the loss of accuracy inherent to local procedures is logarithmic w.r.t. the number of observations.

$\mathbf{3^{0}}.\;$ Together with Theorems 1 and 2, Theorems 3 and 4 provide the full classification of the asymptotics of the minimax risks over anisotropic/isotropic Nikol’skii classes for the class parameters belonging to the sparse zone and, up to some logarithmic factor, belonging to the tail and dense zones as well as the boundaries. We mean that the results of these theorems are valid for any fixed $\vec{\beta}\in(0,\infty)^{d},\vec{r}\in(1,\infty]^{d}$ and $\vec{L}\in(0,\infty)^{d}$ . Indeed, for given $\vec{\beta}$ and $\vec{L}$ one can choose $L_{0}=\min_{j=1,\ldots,d}L_{j}$ , $L_{\infty}=\max_{j=1,\ldots,d}L_{j}$ and the number $\ell$ , used in the kernel construction (2.6), as any integer strictly larger than $\max_{j=1,\ldots,d}\beta_{j}$ .
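The choice of constants described in this remark is easy to spell out. In the sketch below the function name is illustrative, and the smallest admissible $\ell$ is returned, which is just one convenient choice among all integers strictly larger than $\max_{j}\beta_{j}$.

```python
import math

def tuning_constants(beta, L):
    """Pick L_0, L_infinity and the kernel order ell for given class parameters.

    beta, L : sequences of positive numbers (the vectors beta and L).
    Returns (L_0, L_infinity, ell), where ell is the smallest integer
    strictly larger than max_j beta_j.
    """
    L0 = min(L)
    L_inf = max(L)
    ell = math.floor(max(beta)) + 1
    return L0, L_inf, ell
```

For instance, $\vec{\beta}=(0.5,2,1.3)$ and $\vec{L}=(1,4,2.5)$ give $L_{0}=1$, $L_{\infty}=4$ and $\ell=3$.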

2.4.3 Open problems

Let us briefly discuss some unresolved adaptive estimation problems in the convolution structure density model.

Construction of an optimally-adaptive estimator

As we already mentioned, the proposed pointwise selection rule leads to an optimally adaptive estimator only for the class parameters belonging to the sparse zone (in both the bounded and the unbounded case). We conjecture that the construction of an optimally-adaptive estimator for all values of the nuisance parameters via pointwise selection is impossible, and other methods should be invented. It is worth noting that no optimally-adaptive estimator is known either in the density model or in density deconvolution, even in dimension 1. In dimension larger than 1, one of the intriguing questions is related to the possible price to pay for anisotropy ( $\ln^{\frac{d-1}{p}}(n)$ -factor) discussed in the remark $\mathbf{4^{0}}$ after Theorem 3.

Adaptive estimation of unbounded functions

We were able to study the unbounded case only if $\alpha=1$ . The estimation of unbounded densities under direct as well as partially contaminated observations remains an open problem. We conjecture that the results obtained in the case $\alpha=1$ are not true anymore for $\alpha\neq 1$ (neither the upper bounds nor the lower bound), but correct (or nearly correct) upper bounds for the asymptotics of the minimax risk can still be deduced from the oracle inequalities proved in Part I.

In the case $\alpha=1$ there are at least two interesting problems. First, all our results are valid under the condition $p>[\min_{j=1,\ldots,d}\mu_{j}]^{-1}$ . How dropping this assumption would affect the accuracy of estimation is unclear. Next, let us mention that the lower bound result proved in Theorem 1 holds only for the convolution structure density model. Could the same bounds be established in the deconvolution model (1.3)?

Adjustment of “lower” and “upper bound” assumptions to each other

Comparing the assertions of Theorems 1 and 2 with those of Theorems 3 and 4, we remark that obtaining the corresponding lower bounds for the minimax risk requires additional, rather restrictive, assumptions on the function $g$ . Can they be weakened or even removed?

3 Proof of Theorems 3 and 4

The proofs are based on the application of Theorem 3 from Part I and on some auxiliary assertions presented below.

In the subsequent proof $\mathbf{c},\mathbf{c}_{1},\mathbf{c}_{2},C,C_{1},C_{2}\ldots$ , stand for constants that can depend on $g,L_{0},L_{\infty}$ , $Q,R$ , $\vec{\beta}$ , $\vec{r}$ , $d$ and $p$ , but are independent of $\vec{L}$ and $n$ . These constants can be different on different appearances.

3.1 Important concepts from Part I and proof outline

In this section we recall the definition of some important quantities that appeared in Theorem 3 of Part I and discuss the facts which should be established to make this theorem applicable.

$\mathbf{I^{0}.\;}$ Theorem 3 (Part I) deals with the minimax result over a class $\mathbb{F}$ being an arbitrary subset of $\mathbb{F}_{g,\mathbf{u}}(R,D)\cap\mathbb{B}_{\mathbf{q},d}(D)$ defined in Section 2.3 of Part I. In Theorem 3 we will consider $\mathbb{F}={\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{B}_{\mathbf{\infty},d}(Q)$ and, therefore, $\mathbb{F}\subset\mathbb{F}_{g,\mathbf{\infty}}(R,D)\cap\mathbb{B}_{\mathbf{\infty},d}(Q)$ with $D=Q[1-\alpha+\alpha\|g\|_{1}]$ . This makes Theorem 3 (Part I) with $\mathbf{u}=\infty$ applicable in this case.

In Theorem 4 we consider $\mathbb{F}={\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g,\mathbf{\infty}}(R,Q)$ . We will show that for any $\vec{\beta},\vec{r}$ and $\vec{L}$ one can find $\mathbf{q}>1$ and $D>0$ such that ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\subset\mathbb{B}_{\mathbf{q},d}(D)$ and, therefore, Theorem 3 (Part I) is applicable with $\mathbf{u}=\infty$ . The latter inclusions are mostly based on the embedding of anisotropic Nikol’skii spaces used in the proof of Proposition 3 and on Lemma 1.

$\mathbf{II^{0}.\;}$ The application of Theorem 3 (Part I) in the case $\mathbf{u}=\infty$ requires computing

[TABLE]

where, we recall, $F_{n}\big{(}\vec{h}\big{)}=\big{(}\ln{n}+\sum_{j=1}^{d}|\ln{h_{j}}|\big{)}^{1/2}n^{-\frac{1}{2}}\prod_{j=1}^{d}h_{j}^{-\frac{1}{2}}(h_{j}\wedge 1)^{-\boldsymbol{\mu}_{j}(\alpha)}$ and $\mathbf{c}>0$ is a universal constant completely determined by the kernel ${\cal K}_{\ell}$ and the dimension $d$ .

In the next section we propose quite sophisticated constructions of vectors $\boldsymbol{h}(\cdot,\mathbf{s})$ and $\vec{\mathfrak{h}}(\cdot,\mathbf{s})$ , $\mathbf{s}\in[1,\infty]$ , and show in Propositions 1 and 2 that

[TABLE]

Here $\boldsymbol{v}$ is defined in (3.19), $\underline{\mathbf{v}},\mathbf{v}$ are defined in (3.20) and $\overline{\boldsymbol{v}}\in\{1,\mathbf{v_{1}},\mathbf{v_{3}},\overline{\mathbf{v}},\overline{\mathbf{v}}\wedge\mathbf{v_{3}}\}$ , where $\mathbf{v_{1}},\mathbf{v_{3}}$ are defined in (3.22) and $\overline{\mathbf{v}}$ is given in (3.23). In Proposition 3 we prove that for any $\vec{h}\in{\cal H}^{d}$

[TABLE]

and if $\tau(p^{*})>0$ then additionally

[TABLE]

where $\vec{\gamma}$ and $\vec{q}$ are defined in (3.16) below and $C_{1}$ is independent of $\vec{L}$ . At last the definition of $\boldsymbol{h}(\cdot,\mathbf{s})$ and $\vec{\mathfrak{h}}(\cdot,\mathbf{s})$ , $\mathbf{s}\in[1,\infty]$ together with (3.2) allows us to assert, see (3.31), that

[TABLE]

where ${\cal J}_{\infty}=\{j=1,\ldots,d:\;r_{j}=\infty\}$ . Thus, putting

[TABLE]

we obtain in view of (3.1), (3.2) and (3.4) that

[TABLE]

To get (3.6) we have used that for all $n$ large enough and all $v\in[\underline{\mathbf{v}},1]$

$F^{2}_{n}\big{(}\vec{\boldsymbol{h}}(v,\mathbf{1})\big{)}\leq C_{2}(\ln{n}/n)\prod_{j=1}^{d}(\boldsymbol{h}_{j}(v,\mathbf{1}))^{-1-2\boldsymbol{\mu}_{j}(\alpha)},$

where $C_{2}$ is independent of $\vec{L}$ . This follows from assertions (4.1) and (4.3) established in the proof of Proposition 1. We deduce from (3.5) and (3.6), the following bound.

[TABLE]

Moreover, if $\tau(p^{*})>0$ we get in view of (3.1), (3.3) and (3.4)

[TABLE]

3.2 Special set of bandwidths

The bandwidth construction presented below, as well as the auxiliary statements from the next section, will be exploited not only for proving Theorems 3 and 4, but also in the considerations forming Part III of this work. For this reason we formulate them in a slightly more general form than is needed for our current purposes. Set for any $r,\mathbf{s}\in[1,\infty]$

$\varkappa_{\alpha}(r,\mathbf{s})=\frac{\mathbf{s}\omega(\alpha)(2+1/\beta(\alpha))}{(\mathbf{s}+\omega(\alpha))}-r,\quad\alpha\in[0,1].$
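The following small sketch evaluates $\varkappa_{\alpha}(r,\mathbf{s})$ and makes the limiting case $\mathbf{s}=\infty$ explicit. The quantities $\omega(\alpha)$ and $\beta(\alpha)$ are passed in as plain numbers because their definitions are not reproduced in this section; the function name and signature are illustrative assumptions.

```python
import math

def kappa(r, s, omega, beta):
    """kappa_alpha(r, s) = s * omega * (2 + 1/beta) / (s + omega) - r.

    omega, beta : numerical values of omega(alpha) and beta(alpha)
    s           : a number in [1, inf]; for s = math.inf the limit is used
    """
    c = omega * (2.0 + 1.0 / beta)
    if math.isinf(s):
        return c - r          # s / (s + omega) -> 1 as s -> infinity
    return s * c / (s + omega) - r
```

With $\mathbf{s}=\infty$ the expression reduces to $\omega(\alpha)(2+1/\beta(\alpha))-r$, and large finite values of $\mathbf{s}$ approach this limit.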

Recall that $\mathbf{c}=\big{(}20d\big{)}^{-1}\big{[}\max(2c_{{\cal K}_{\ell}}\|{\cal K}_{\ell}\|_{\infty},\|{\cal K}_{\ell}\|_{1})\big{]}^{-d}$ and let $\boldsymbol{L}>0$ be any number satisfying (recall that $C_{1}$ appeared in (3.2))

[TABLE]

Recall that $\delta_{n}=L(\alpha)n^{-1}\ln{n}$ and introduce for any $v>0$ , $\mathbf{s}\in[1,\infty]$ and $j=1,\ldots,d$

[TABLE]

where we have put $p_{\pm}=[\sup_{j\in\bar{{\cal J}}_{\infty}}r_{j}]\vee p$ , $\bar{{\cal J}}_{\infty}$ is the complement of ${\cal J}_{\infty}$ and

[TABLE]

The constant $\mathfrak{a}>0$ will be chosen differently in accordance with some special relationships between the parameters $\vec{\beta}$ , $\vec{r}$ , $\vec{\mu}$ , $\alpha$ and $p$ . Determine $\boldsymbol{h}_{j}(\cdot,\mathbf{s})$ and $\mathfrak{h}_{j}(\cdot,\mathbf{s}),j=1,\ldots,d$ , from the relations

[TABLE]

and set $\vec{\boldsymbol{h}}(\cdot,\mathbf{s})=\big{(}\boldsymbol{h}_{1}(\cdot,\mathbf{s}),\ldots,\boldsymbol{h}_{d}(\cdot,\mathbf{s})\big{)}$ and $\vec{\mathfrak{h}}(\cdot,\mathbf{s})=\big{(}\mathfrak{h}_{1}(\cdot,\mathbf{s}),\ldots,\mathfrak{h}_{d}(\cdot,\mathbf{s})\big{)}$ .

3.3 Auxiliary statements

All the results formulated below are proved in Section 4. Let

$\mathfrak{z}(v)=2\big{(}\mathfrak{a}^{-2}\delta_{n}\big{)}^{-\frac{\omega(\alpha)}{\omega(\alpha)+\mathbf{u}}}v^{\frac{\omega(\alpha)(2+1/\beta(\alpha))}{\mathbf{u}+\omega(\alpha)}},\quad\mathbf{u}\in[1,\infty],$

and remark that $\mathfrak{z}(\cdot)\equiv 2$ if $\mathbf{u}=\infty$ . Note also that

[TABLE]
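The function $\mathfrak{z}(\cdot)$ introduced above, and the fact that it is identically $2$ when $\mathbf{u}=\infty$, can be checked with a short sketch. As before, $\omega(\alpha)$, $\beta(\alpha)$, $\mathfrak{a}$ and $\delta_{n}$ are supplied as plain numbers, and the function name is an illustrative assumption.

```python
import math

def z_frak(v, u, a, delta_n, omega, beta):
    """z(v) = 2 * (a^-2 * delta_n)^(-omega/(omega+u)) * v^(omega*(2+1/beta)/(u+omega)).

    For u = math.inf both exponents vanish, so the value is 2 for every v > 0.
    """
    if math.isinf(u):
        return 2.0
    e1 = -omega / (omega + u)
    e2 = omega * (2.0 + 1.0 / beta) / (u + omega)
    return 2.0 * (delta_n / a ** 2) ** e1 * v ** e2
```

For large finite $\mathbf{u}$ the value is already close to $2$, in line with the limiting statement.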

Introduce the following notations: $\mu(\alpha)=\min_{j=1,\ldots,d}\mu_{j}(\alpha)$ ,

[TABLE]

Recall that $z(\alpha)=\omega(\alpha)(2+1/\beta(\alpha))\beta(0)\tau(\infty)+1$ and define

[TABLE]

Set $\mathbf{u}^{*}=[-\tau(\infty)\beta(0)]^{-1}$ if $\tau(\infty)<0$ and let $\mathbf{u}^{*}=\infty$ if $\tau(\infty)\geq 0$ . Put finally $\mathbf{y}=\mathbf{u}^{*}\vee p^{*}$ .

Proposition 1.

Let $\vec{\beta}$ , $\vec{r}$ , $L_{0},L_{\infty}$ , $\vec{\mu}$ , $\alpha$ and $p$ be given. Assume that $\vec{L}\in[L_{0},L_{\infty}]^{d}$ . Then,

1) there exists $\mathfrak{a}>0$ independent of $\vec{L}$ such that for all $n$ large enough

[TABLE]

2) there exists $\mathfrak{a}>0$ independent of $\vec{L}$ and $\mathbf{u}$ such that for all $n$ large enough

[TABLE]

if either $\varkappa_{\alpha}(p^{*},\mathbf{u})<0,\tau(\infty)\geq 0$ or $\varkappa_{\alpha}(p^{*},\mathbf{u})<0,$ $\tau(p^{*})>0$ , $Y\geq[X+1]\mathbf{y}^{-1}-1/\mathbf{u}$ .

Remark 2.

Note that if $\alpha\neq 1$ , the condition $Y\geq[X+1]\mathbf{y}^{-1}-1/\mathbf{u}$ simply means $\mathbf{u}\leq\mathbf{u^{*}}\vee p^{*}$ , since $X=Y=0$ . On the other hand if $\alpha=1$ this condition holds if $\tau(\infty)\geq 0$ whatever the values of $\vec{\beta},\vec{\mu}$ and $\vec{r}$ , since $Y>0$ . Also, note that

[TABLE]

Indeed, since $r_{j}\leq p^{*}\leq\mathbf{y}$ for any $j=1,\ldots,d$ we have

[TABLE]

and (3.21) follows. To get the last inequality we have used that $\tau(\mathbf{u}^{*})=0$ and that $\tau(\cdot)$ is strictly decreasing, so $\tau(\mathbf{y})\leq 0$ . In particular we deduce from (3.21) that the condition $Y>[X+1]\mathbf{y}^{-1}-1/\mathbf{u}$ is always fulfilled in the case $\mathbf{u}=\mathbf{u}^{*}$ .
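The first two claims of Remark 2 can be written out explicitly (worked steps added for convenience, using only the definitions of $\mathbf{u}^{*}$ and $\mathbf{y}$ given before Proposition 1):

```latex
% Case \alpha \neq 1, where X = Y = 0:
\[
Y\geq[X+1]\mathbf{y}^{-1}-\frac{1}{\mathbf{u}}
\;\Longleftrightarrow\;
\frac{1}{\mathbf{u}}\geq\frac{1}{\mathbf{y}}
\;\Longleftrightarrow\;
\mathbf{u}\leq\mathbf{y}=\mathbf{u}^{*}\vee p^{*}.
\]
% Case \alpha = 1 and \tau(\infty) \geq 0, where \mathbf{u}^{*} = \infty and hence \mathbf{y}^{-1} = 0:
\[
Y\geq[X+1]\mathbf{y}^{-1}-\frac{1}{\mathbf{u}}
\;\Longleftrightarrow\;
Y\geq-\frac{1}{\mathbf{u}},
\qquad\text{which holds since } Y>0 .
\]
```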

Recall that $\boldsymbol{v}\to 0,n\to\infty,$ is defined in (3.19) and introduce the following quantities.

[TABLE]

where $\pi(\mathbf{u})=[1/\omega(0)-1/\mathbf{u}][1+X]-1/\beta(0)[Y+1/\mathbf{u}].\;$ Define also

[TABLE]

Note that $\mathbf{v_{1}}\to\infty,n\to\infty$ , if $\infty>\mathbf{u}\geq\mathbf{u}^{*}\vee p^{*}$ (it will be proved in Proposition 2 below). However $\mathbf{v_{1}}=1$ if $\mathbf{u}=\infty$ . As is shown in the proof of Proposition 1, formula (4.11), $\boldsymbol{v}<\mathbf{v}$ for all $n$ large enough. Also $\mathbf{v_{2}}\to\infty,n\to\infty$ , if $\varkappa_{1}(p^{*},\mathbf{u})<0$ . Finally, $\mathbf{v_{3}}\to\infty,n\to\infty$ , since $\omega(0)>\omega(1)$ . Moreover $\mathbf{v_{3}}=\infty$ if $\pi(\mathbf{u})\leq 0$ . Introduce finally

[TABLE]

Proposition 2.

Let $\vec{\beta}$ , $\vec{r}$ , $L_{0},L_{\infty}$ , $\vec{\mu}$ , $\alpha$ and $p$ be given and let $\vec{L}\in[L_{0},L_{\infty}]^{d}$ , $\mathbf{u}\in[\mathbf{u}^{*}\vee p^{*},\infty]$ . Then, there exists $\mathfrak{a}>0$ independent of $\vec{L}$ and $\mathbf{u}$ such that for all $n$ large enough

[TABLE]

In the current paper we will use the statements of Propositions 1 and 2 only with $\mathbf{u}=\infty$ . In this context we remark that $\varkappa_{\alpha}(\cdot)\equiv\varkappa_{\alpha}(\cdot,\mathbf{\infty})$ .

Proposition 3.

Let $\ell\in{\mathbb{N}}^{*}$ , $p>1$ and $K$ satisfying Assumption 7 be fixed. Then for any $\vec{\beta}\in(0,\ell]^{d}$ , $\vec{r}\in[1,\infty]^{d}$ and $\vec{L}\in(0,\infty)^{d}$ one can find $C_{1}>0$ independent of $\vec{L}$ such that (3.2) holds. If additionally $\tau(p^{*})>0$ then (3.3) is fulfilled as well. At last, (3.2) and (3.3) remain true if one replaces the quantity $\mathbf{B}$ by $\mathbf{B}^{*}$ .

The quantities $\mathbf{B}_{j,s,\mathbb{F}}(\cdot)$ and $\mathbf{B}^{*}_{j,s,\mathbb{F}}(\cdot)$ are introduced in Part I, but the reader can also find them in the proof of the proposition. Let us also present the following auxiliary results which will be useful in the sequel. Their proofs are postponed to the Appendix.

Lemma 1.

For any $\mathbf{u}\in[1,\infty]$

[TABLE]

Let $Y-[X+1]\mathbf{y}^{-1}>0$ and $\varkappa_{1}(p^{*},\mathbf{\infty})\geq 0$ . Then there exists $s>p^{*}$ such that

[TABLE]

We finish this section with the following observations which will be useful in the sequel.

If $\varkappa_{\alpha}(p^{*})\geq 0$ one has

[TABLE]

If $\varkappa_{\alpha}(p^{*})<0$ one has

[TABLE]

3.4 Concluding remarks

Let us collect some bounds for several terms appearing in Theorem 3 (Part I) and used in the proofs of Theorems 3 and 4 simultaneously.

$\mathbf{1^{0}.\;}$ First we remark that $\boldsymbol{h}_{j}(\cdot,\mathbf{1})\equiv\boldsymbol{h}_{j}(\cdot,\mathbf{\infty})\equiv\mathfrak{h}_{j}(\cdot,\mathbf{\infty})\leq\big{(}\boldsymbol{L}L_{j}^{-1}\big{)}^{\frac{1}{\beta_{j}}}$ , $j\in{\cal J}_{\infty}$ . Then, (3.4) follows from (3.2) and (3.9) because for any $j\in{\cal J}_{\infty}$ and $v>0$

[TABLE]

$\mathbf{2^{0}.\;}$ We deduce from the definition of $\vec{\boldsymbol{h}}(\cdot,\mathbf{s}),\mathbf{s}\in\{1,\infty\}$ that

[TABLE]

It yields together with (3.7) and the definitions of $\underline{\mathbf{v}}$ and $\boldsymbol{v}$ , choosing $\underline{\boldsymbol{v}}=\underline{\mathbf{v}}$ ,

[TABLE]

After elementary computations and taking into account (3.28), we obtain

[TABLE]

These bounds are not surprising because $\varrho(\alpha)=\rho(\alpha)$ if $\varkappa_{\alpha}(p)\geq 0$ . At last, if $\tau(p^{*})>0$ , we get from (3.8) thanks to the definition of $\vec{\mathfrak{h}}(\cdot,\mathbf{\infty})$ and the presentation proved in (4.6) with $\mathbf{u}=\infty$

[TABLE]

$\mathbf{3^{0}.\;}$ At last, choosing $\underline{\boldsymbol{v}}=\underline{\mathbf{v}}$ , we obtain $\ell_{\mathbb{H}}(\underline{\boldsymbol{v}})\leq c_{6}\delta_{n}^{\frac{p-1}{1-1/\omega(\alpha)+1/\beta(\alpha)}}\big{(}\ln{n}\big{)}^{t(\mathbb{H})}$ , which yields by (3.28), (3.29) and (3.30):

[TABLE]

3.5 Proof of Theorem 3

As it has already been mentioned we will apply Theorem 3 (Part I) with $\mathbf{u}=\infty$ , $\mathbf{q}=\infty$ , $D=Q[1-\alpha+\alpha\|g\|_{1}]\vee Q$ and $\underline{\boldsymbol{v}}=\underline{\mathbf{v}}$ .

$\mathbf{1^{0}.\;}$ Consider the cases $\varkappa_{\alpha}(p^{*})\geq 0,$ or $\varkappa_{\alpha}(p^{*})<0,\tau(\infty)\leq 0$ .

Choose $\overline{\boldsymbol{v}}=1$ and remark that the statements of Propositions 1 and 2 hold for any $v\in[\underline{\boldsymbol{v}},\overline{\boldsymbol{v}}]$ . Indeed, it suffices to note that ${\cal I}_{\mathbf{\infty}}(\alpha)\supseteq[\underline{\boldsymbol{v}},\overline{\boldsymbol{v}}]:=[\underline{\mathbf{v}},1]$ , because $\mathbf{v_{1}},\mathbf{v_{2}},\mathbf{v_{3}}>1$ and $\overline{\mathbf{v}}\geq 1$ if $\tau(\infty)<0$ since in this case $\mathbf{v}>1$ by (3.25). Then we can apply all the bounds obtained above, and in particular we get from (3.5)

[TABLE]

since $\omega(\alpha)\geq p\rho(\alpha)$ in both considered cases in view of the second equality in (3.28) and of (3.30). Applying the third assertion of Theorem 3 (Part I), we obtain from (3.4), (3.33), (3.36) and (3.35)

$\displaystyle{\sup_{f\in{\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\cap\mathbb{F}_{g}(R)}}{\cal R}^{(p)}_{n}[\widehat{f}_{\vec{\mathbf{h}}(\cdot)},f]\leq C\bigg{[}(c_{2}+c_{3}+c_{5}+c_{6})\mathfrak{b}^{p}_{n}(\mathbb{H})\delta_{n}^{p\rho(\alpha)}\bigg{]}^{\frac{1}{p}}\leq c_{7}\mathfrak{b}_{n}(\mathbb{H})\delta_{n}^{\rho(\alpha)},$

and the assertion of Theorem 3 follows in both considered cases.

$\mathbf{2^{0}.\;}$ Consider the case $\varkappa_{\alpha}(p^{*})<0,\tau(\infty)>0$ .

Choose $\overline{\boldsymbol{v}}=\mathbf{v}$ and remark that the statements of Propositions 1 and 2 hold for any $v\in[\underline{\boldsymbol{v}},\overline{\boldsymbol{v}}]$ . Indeed, $\tau(\infty)>0$ implies $\mathbf{v}<1$ and, therefore, $\overline{\mathbf{v}}=\overline{\mathbf{v}}\wedge\mathbf{v_{3}}=\mathbf{v}$ . We deduce from (3.4), (3.33), (3.34) and (3.35), applying the first assertion of Theorem 3 (Part I) that

[TABLE]

Here we have also used (3.30). This completes the proof of Theorem 3.

3.6 Proof of Theorem 4

In the following we assume $p^{*}<\infty$ , since $p^{*}=\infty$ implies by definition of the anisotropic Nikol’skii class that ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\subset\mathbb{B}_{\infty,d}(L_{\infty})$ . Hence, the results in that case follow from Theorem 3 since $\varrho(\alpha)=\rho(\alpha)$ when $p^{*}=\infty$ .

Moreover, we remark that the imposed condition $p>[\min_{j=1,\ldots}\mu_{j}]^{-1}$ implies $Y\geq[X+1]\mathbf{y}^{-1}-1/\mathbf{u}$ in view of (3.21) proved in Remark 2. This, first, makes the second assertion of Proposition 1 applicable.

Next, it allows us (recall that $p^{*}<\infty$ and $\alpha=1$) to rewrite ${\cal I}_{\infty}(1)$, which appears in Proposition 2, as

${\cal I}_{\infty}(1)=[\boldsymbol{v},\mathbf{v_{3}}]\mathrm{1}_{\{\varkappa_{1}(p^{*})\geq 0\}}+[\boldsymbol{v},\overline{\mathbf{v}}]\mathrm{1}_{\{\varkappa_{1}(p^{*})<0\}}.$

$\mathbf{1^{0}.\;}$ Consider the case $\varkappa_{\alpha}(p^{*})<0,\tau(p^{*})>0$ .

Taking into account that $\vec{L}\in[L_{0},L_{\infty}]$ we remark that in view of Nikol’skii (1977) [Theorem 6.9.1, Section 6.9] ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\subset\mathbb{B}_{p^{*},d}(c_{9}L_{\infty})$ , where $c_{9}$ is independent of $\vec{L}$ . Thus, Theorem 3 (Part I) is applicable with $\mathbf{u=\infty}$ , $\mathbf{q}=p^{*}$ and $D=c_{9}L_{\infty}\vee Q$ . Choose $\overline{\boldsymbol{v}}=\mathbf{v}$ and remark that the statements of Propositions 1 and 2 hold since $\overline{\mathbf{v}}=\mathbf{v}$ . The assertion of the theorem is obtained from (3.4), (3.33), (3.34), (3.35), (3.29) and the first assertion of Theorem 3 (Part I) by the same computations that led to (3.37).

$\mathbf{2^{0}.\;}$ Consider the case $\varkappa_{1}(p^{*})<0,\tau(p^{*})\leq 0$. Recall that $p^{*}>p$ in this case because it is necessary for the existence of a uniformly consistent estimator. Since the definition of the anisotropic Nikol’skii class implies that ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\subset\mathbb{B}_{p^{*},d}(L_{\infty})$, we assert that the second assertion of Theorem 3 (Part I) is applicable with $\mathbf{u}=\infty$, $\mathbf{q}=p^{*}$ and $D=L_{\infty}\vee Q$. Choose $\overline{\boldsymbol{v}}=\mathbf{v_{2}}$ and note that $\overline{\mathbf{v}}=\mathbf{v_{2}}$ in the considered case. Thus, we deduce from (3.4), (3.33), (3.35) and (3.29)

[TABLE]

and the assertion of the theorem follows in this case.

$\mathbf{3^{0}.\;}$ It remains to study the case $\varkappa_{1}(p^{*})\geq 0$. Let $s$ be an arbitrary number satisfying (3.27) of Lemma 1. Since $\tau(s)>0$ and $s>p^{*}$ we can assert in view of Nikol’skii (1977) [Theorem 6.9.1, Section 6.9] that ${\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}\subset\mathbb{B}_{s,d}(c_{10}L_{\infty})$, where $c_{10}$ is independent of $\vec{L}$. Thus, Theorem 3 (Part I) is applicable with $\mathbf{u=\infty}$, $\mathbf{q}=s$ and $D=c_{10}L_{\infty}\vee Q$. Choosing $\overline{\boldsymbol{v}}=\mathbf{v_{3}}$, we deduce from (3.4), (3.33), (3.35) and from the second assertion of Theorem 3 (Part I)

[TABLE]

Since either $p^{*}/\omega(0)=1/\beta(0)$ , $\varkappa_{1}(p^{*})>0$ or $p^{*}/\omega(0)>1/\beta(0)$ , $\varkappa_{1}(p^{*})\geq 0$ and $s>p^{*}\geq p$ we get

[TABLE]

Simple algebra shows that

[TABLE]

Using again $\varkappa_{1}(p^{*})\geq 0$ and $p^{*}\geq p$ we obtain

[TABLE]

since $s$ satisfies (3.27) of Lemma 1. Thus, we have for all $n$ large enough

$\delta_{n}^{\frac{Y(s-p)}{[1+X]/\omega(0)-Y/\beta(0)}}\leq\delta_{n}^{\frac{p}{2+1/\beta(1)}}\leq\mathfrak{b}^{p}_{n}(\mathbb{H})\delta_{n}^{p\varrho(\alpha)}$

and the assertion of the theorem in the case $\varkappa_{1}(p^{*})\geq 0$ follows from (3.38) and the first equality in (3.28). Theorem 4 is proved.

4 Proofs of Propositions 1, 2 and 3

The proof of Lemma 2 below is postponed to the Appendix.

Lemma 2.

For any $\vec{\beta}$ , $\vec{r}$ , $\vec{\mu}$ , $p\geq 1$ and $\alpha\in[0,1]$ the following is true.

$1/\gamma(\alpha)-1/\beta(\alpha)=\big{[}\tau(\infty)\beta(0)\big{]}^{-1}\big{[}1/\omega(\alpha)-1/\upsilon(\alpha)\big{]}.$

4.1 Proof of Proposition 1

We start the proof with several remarks which will be useful in the sequel. First, obviously there exists $0<\mathbf{T}:=T\big{(}\vec{\beta},\vec{r},\vec{\mu},p\big{)}<\infty$ independent of $\vec{L}$ such that

[TABLE]

Next, for any $\mathbf{s}\in[1,\infty]$ and any $v>0$

[TABLE]

Let us proceed to the proof of the first assertion. First we remark that for all $n\geq 3$

[TABLE]

Indeed, for any $v>0$ we have, since $\boldsymbol{L}\leq L_{0}$,

[TABLE]

Therefore, for any $v\in[\underline{\mathbf{v}},1]$ one has in view of the definition of $\underline{\mathbf{v}}$

[TABLE]

Note that for any $j\in{\cal J}_{\infty}$

[TABLE]

and the proof of (4.3) is completed since $\boldsymbol{h}_{j}(\cdot,\mathbf{1})\leq\widetilde{\boldsymbol{\eta}}_{j}(\cdot,\mathbf{1})$ by construction.

Set $T_{0}=\big{[}\mathbf{T}+2\big{]}\;e^{d+2\sum_{j=1}^{d}\mu_{j}(\alpha)}\boldsymbol{L}^{-\frac{1}{\beta(\alpha)}}$ and remark that in view of (4.1), (4.2) and (4.3) for all $n$ large enough and any $v\in[\underline{\mathbf{v}},1]$

[TABLE]

Here we have taken into account that $\boldsymbol{h}_{j}(v,\mathbf{s})\geq e^{-1}\boldsymbol{\eta}_{j}(v,\mathbf{s})$ . Since

[TABLE]

denoting $\mathfrak{a}=\sqrt{a/T_{0}}$ we assert that

[TABLE]

The first assertion is established.

Before proving the second assertion, let us make several remarks.

$\mathbf{1^{0}.}\;$ For any $\mathbf{u}\in[1,\infty]$ the following is true.

[TABLE]

The first equality follows directly from the definition of $\widehat{\boldsymbol{\eta}}_{j}(\mathbf{v},\mathbf{u})$ since, recall, $\gamma_{j}=\beta_{j},q_{j}=\infty$ if $j\in{\cal J}_{\infty}$. Thus, let us prove the second equality. We have

[TABLE]

Here we have used that $q_{j}=p_{\pm}$ for any $j\in\bar{{\cal J}}_{\infty}$ . Using the definition of $\mathbf{v}$ we get

[TABLE]

Using the definition of $z(\alpha)$ we obtain

[TABLE]

Applying Lemma 2, we obtain

[TABLE]

The second formula in (4.6) is established.

$\mathbf{2^{0}.}\;$ Next, let us prove that

[TABLE]

If ${\cal J}_{\infty}\neq\emptyset$ , which is equivalent to $p^{*}=\infty$ , the definition of $\mathbf{v}$ implies that $\mathbf{v}\leq 1$ for all $n$ large enough, since $\tau(p^{*})=\tau(\infty)>0$ and in view of (3.25). We deduce from the first equality in (4.6)

[TABLE]

and (4.7) is proved for any $j\in{\cal J}_{\infty}$ .

It remains to note that $\tau(p_{\pm})\geq\tau(p^{*})$ since $p^{*}\geq p_{\pm}$ and therefore, if $\tau(p^{*})\geq 0$ we have

[TABLE]

for all $n$ large enough in view of (3.25) of Lemma 1, the second equality in (4.6) and since $\boldsymbol{L}L_{j}^{-1}\leq 1$ . This completes the proof of (4.7).

$\mathbf{3^{0}.}\;$ For any $\mathbf{u}\in[1,\infty]$ one has

[TABLE]

where we have denoted $T\big{(}\alpha\big{)}=\inf_{\vec{L}\in[L_{0},L_{\infty}]^{d}}\prod_{j\in{\cal J}_{\infty}}\big{(}\boldsymbol{L}L_{j}^{-1}\big{)}^{\frac{1+2\boldsymbol{\mu}_{j}(\alpha)}{\beta_{j}}}\prod_{j\in\bar{{\cal J}}_{\infty}}\big{(}\boldsymbol{L}L_{j}^{-1}\big{)}^{\frac{1+2\boldsymbol{\mu}_{j}(\alpha)}{\gamma_{j}}}$ .

Indeed, we have in view of (4.6) and the definition of $\mathbf{v}$

[TABLE]

where we have put $\frac{1}{\beta_{\infty}(\alpha)}=\sum_{j\in{\cal J}_{\infty}}\frac{1+2\boldsymbol{\mu}_{j}(\alpha)}{\beta_{j}},\;\frac{1}{\gamma_{\pm}(\alpha)}=\sum_{j\in\bar{{\cal J}}_{\infty}}\frac{1+2\boldsymbol{\mu}_{j}(\alpha)}{\gamma_{j}}.$ Note that for any $\alpha\in[0,1]$

[TABLE]

and (4.8) and (4.9) are established.

$\mathbf{4^{0}.}\;$ Simple algebra shows that for any $\mathbf{u}\in[1,\infty]$

[TABLE]

and we deduce from (4.8) for any $\mathbf{u}\in[1,\infty]$ (recall that $\mathfrak{z}\equiv 2$ if $\mathbf{u}=\infty$ )

[TABLE]

Let us also prove that for any $\mathbf{u}\in[1,\infty]$ and all $n$ large enough

[TABLE]

The latter inclusion follows from (3.19). Indeed, if $\tau(\infty)\leq 0$ then $\mathbf{v}\geq 1\geq\boldsymbol{v}$ . If $\tau(\infty)>0$

[TABLE]

in view of (3.25), so $\mathbf{v}>\boldsymbol{v}$ . Note at last that for any $\mathbf{u}\in[1,\infty]$

[TABLE]

$\mathbf{5^{0}.}\;$ Let us proceed to the proof of the second assertion. Let us choose $\mathfrak{a}<aT(\alpha)/(4T_{0})<1$ . We have in view of (4.1), (4.8) and (4.10) similarly to (4.5)

[TABLE]

Thus to prove the assertion all we need to show is that $\vec{\mathfrak{h}}(\mathbf{v},\mathbf{u})\in\mathfrak{H}(\mathbf{v})$ , i.e. $G_{n}\big{(}\vec{\mathfrak{h}}(\mathbf{v},\mathbf{u})\big{)}\leq a\mathbf{v}$ . Let us distinguish three cases.

$\mathbf{5^{0}a.}\;$ Let $\tau(\infty)\geq 0.$ We remark that the definition of $\mathbf{v}$ in this case yields $\mathbf{v}\leq 1$ for all $n$ large enough and we obtain from (4.10) and (4.11) that

[TABLE]

Then we have in view of (4.1), (4.7), (4.8) and (4.14) similarly to (4.5)

[TABLE]

$\mathbf{5^{0}b.}\;$ Let $\tau(\infty)<0,\tau(p^{*})>0$ and $\alpha\neq 1$ . Then by assumption $\mathbf{u}\leq p$ , and thus $\tau(\mathbf{u})\geq 0$ . We get from (4.10) and (4.12)

[TABLE]

so $G_{n}\big{(}\vec{\mathfrak{h}}(\mathbf{v},\mathbf{u})\big{)}\leq a\mathbf{v}$ follows from (4.15) and (4.13).

$\mathbf{5^{0}c.}\;$ Let $\tau(\infty)<0,\tau(p^{*})>0$ , $\alpha=1$ . We have as previously

[TABLE]

Here we have used (4.10) and put $T_{1}=T^{-1}(0)\boldsymbol{L}^{-1/\beta(0)}$ . Our goal now is to show that for any $\mathbf{u}\in[1,\infty]$ and all $n$ large enough

[TABLE]

In view of (4.9) and of the definition of $\mathfrak{z}(\cdot)$ in order to establish (4.18) it suffices to show that $z(1)/\omega(1)-1+2/\mathbf{u}\geq 0$ .

Since we assumed $\tau(\infty)<0$ and $\tau(p^{*})>0$, necessarily $\mathbf{u}^{*}>p^{*}$ since $\tau(\mathbf{u}^{*})=0$ and $\tau(\cdot)$ is strictly decreasing. Hence, the required result follows from (3.26). Thus, (4.18) is proved. Then choosing $\mathfrak{a}$ such that $T_{0}(2T^{-1}(1)T_{1})^{1/2}\mathfrak{a}^{2}\leq a$, we obtain from (4.17) and (4.18) that for all $n$ large enough

[TABLE]

The second assertion is proved.

4.2 Proof of Proposition 2

We start the proof with several remarks which will be useful in the sequel.

$\mathbf{1^{0}.}\;$ Let us show that for all $n$ large enough

[TABLE]

In view of the definition of $\widetilde{\boldsymbol{\eta}}_{j}(\cdot,\mathbf{u}),j=1,\ldots,d$ ,

[TABLE]

Therefore, for any $v\in[\mathbf{v}_{0},1]$ one has, taking into account that $\boldsymbol{L}\leq L_{0}$ ,

[TABLE]

It remains to note that $\boldsymbol{v}>\mathbf{v}_{0}$ for all $n$ large enough and, therefore,

[TABLE]

We also have in view of the definition of $\widetilde{\boldsymbol{\eta}}_{j}(\cdot,\mathbf{u}),j=1,\ldots,d$ ,

[TABLE]

for any $v\leq 1$ . This together with (4.21) proves (4.19) in the cases when ${\cal I}_{\mathbf{u}}(\alpha)=[\boldsymbol{v},1]$ .

Noting that $p^{*}<\infty$ is equivalent to ${\cal J}_{\infty}=\emptyset$ , we deduce from (4.20) for any $v\geq 1$

[TABLE]

Thus, if $\varkappa_{\alpha}(p^{*},\mathbf{u})\geq 0$ then for any $v\geq 1$

[TABLE]

This together with (4.21) yields (4.19) in the case $\varkappa_{\alpha}(p^{*},\mathbf{u})\geq 0,p^{*}<\infty$ , whatever the value of $\alpha$ .

Let $\alpha=1,p^{*}<\infty,\varkappa_{\alpha}(p^{*},\mathbf{u})<0,\tau(p^{*})>0$ .

Then $\overline{\mathbf{v}}=\mathbf{v}$ and we have for any $j=1,\ldots,d$ and $v\in[1,\mathbf{v}]$ in view of the definition of $\mathbf{v}$

[TABLE]

in view of (3.25). Hence, (4.19) holds in this case.

Let $\alpha=1,\varkappa_{\alpha}(p^{*},\mathbf{u})<0,\tau(p^{*})\leq 0$ .

Then $\overline{\mathbf{v}}=\mathbf{v}_{2}$ and we have for any $v\in[1,\mathbf{v_{2}}]$ in view of the definition of $\mathbf{v_{2}}$

[TABLE]

and, therefore (4.19) holds in this case.

Let $\varkappa_{\alpha}(p^{*},\mathbf{u})<0,\;\alpha\neq 1,\mathbf{u}<\infty$ . First we note that $\tau(\infty)<0$ and $\mathbf{u}\geq\mathbf{u^{*}}\vee p^{*}$ imply

[TABLE]

since either $\tau(p^{*})\leq 0$ or $\mathbf{u}^{*}>p^{*}$ and $\tau\big{(}\mathbf{u}^{*}\vee p^{*}\big{)}=0$. Thus $\mathbf{v_{1}}\to\infty$ as $n\to\infty$ and, therefore, for any $v\in[1,\mathbf{v}_{1}]$

[TABLE]

Note that $1-\mathbf{u}/\omega(0)+1/\beta(0)=\varkappa_{0}(p^{*},\mathbf{u})\big{[}1/\mathbf{u}+1/\omega(0)\big{]}-(\mathbf{u}-p^{*})\big{[}1/\mathbf{u}+1/\omega(0)\big{]}$ and, therefore

[TABLE]

which yields $\mathbf{v}_{1}^{-\varkappa_{0}(p^{*},\mathbf{u})}\leq\big{\{}\mathfrak{a}^{-2}\delta_{n}\big{\}}^{-\frac{\mathbf{u}\omega(0)}{\mathbf{u}+\omega(0)}}$ .

It remains to note that if $\tau(\infty)\geq 0$ then $\mathbf{u}^{*}=\infty$ and, therefore, $\mathbf{u}=\infty$. This implies $\mathbf{v}_{1}=1$ and ${\cal I}_{\mathbf{u}}(\alpha)=[\boldsymbol{v},1]$, and this case has already been treated. This completes the proof of (4.19).

$\mathbf{2^{0}}.\;$ Remark that there obviously exists $0<\mathbf{S}:=S\big{(}\vec{\beta},\vec{r},\vec{\mu},p\big{)}<\infty$ independent of $\vec{L}$ such that

[TABLE]

Hence, in view of (4.19) one has for all $n$ large enough and $v\in{\cal I}_{\mathbf{u}}(\alpha)$

[TABLE]

Taking into account that $\boldsymbol{h}_{j}(v,\mathbf{u})\geq e^{-1}\widetilde{\boldsymbol{\eta}}_{j}(v,\mathbf{u})$ and setting $S_{0}=\big{[}\mathbf{S}+2\big{]}\;e^{d+2\sum_{j=1}^{d}\mu_{j}}\boldsymbol{L}^{-\frac{1}{\beta(1)}}$ we obtain from (4.2) for any $\alpha\in[0,1]$ and $v\in{\cal I}_{\mathbf{u}}(\alpha)$

[TABLE]

From now on we choose $\mathfrak{a}\leq a/(2S_{0})<1$ . It yields in view of (4.22) and (4.23)

[TABLE]

$\mathbf{3^{0}}.\;$ Since (4.24) holds, to finish the proof of Proposition 2 all we need to show is that $G_{n}\big{(}\vec{\mathfrak{h}}(\mathbf{v},\mathbf{u})\big{)}\leq a\mathbf{v},\;\;\forall v\in{\cal I}_{\mathbf{u}}(\alpha).$ Let us distinguish three cases.

$\mathbf{3^{0}a}.\;$ Let $p^{*}=\infty$ or $\alpha\neq 1,\mathbf{u}=\infty$ . First we note that in these cases ${\cal I}_{\mathbf{u}}(\alpha)=[\boldsymbol{v},1]$ . Next in view of the second inequality in (4.22), (4.19), (4.23) and (4.24) we obtain

[TABLE]

To get the last inequality we have used that $a<1$ , $\mathfrak{z}(\cdot)\geq 2$ and $v\leq 1$ .

$\mathbf{3^{0}b}.\;$ Let $\alpha\neq 1,p^{*}<\infty,\mathbf{u}<\infty$ . We have in view of the second inequality in (4.22) and (4.23)

[TABLE]

For any $\mathbf{u}\neq\infty$ , simple algebra shows that $v\mathfrak{z}^{-1}(v)=\big{\{}\mathfrak{a}^{-2}\delta_{n}\big{\}}^{\frac{\mathbf{u}\omega(0)}{\mathbf{u}+\omega(0)}}v^{\frac{\mathbf{u}-\omega(0)-\omega(0)/\beta(0)}{\mathbf{u}+\omega(0)}}$ , and since $\mathbf{u}\geq\mathbf{u}^{*}$ , which implies $\mathbf{u}-\omega(0)-\omega(0)/\beta(0)>0$ , the result follows from

[TABLE]

$\mathbf{3^{0}c}.\;$ Let $\alpha=1,p^{*}<\infty$ . We have in view of the second inequality in (4.22) and (4.23)

[TABLE]

where we have denoted $S_{1}=S_{0}\boldsymbol{L}^{-1/\beta(0)}$ .

Our goal now is to show that for all $n$ large enough

[TABLE]

We easily compute for any $v>0$

[TABLE]

Denoting the right hand side of the obtained inequality by $P(v)$ we obviously have

[TABLE]

where $\mathbf{\widetilde{v}}\in\{\mathbf{v_{3}},\overline{\mathbf{v}},\overline{\mathbf{v}}\wedge\mathbf{v_{3}}\}$ . Remarking that $\mathfrak{z}(\boldsymbol{v})=2$ we easily compute that for any $\mathbf{u}\in[1,\infty]$

[TABLE]

Moreover we obviously have

[TABLE]

$\mathbf{3^{0}c1}.\;$ Consider the case $\varkappa_{1}(p^{*},\mathbf{u})\geq 0$ . Here $\widetilde{\mathbf{v}}=\mathbf{v_{3}}$ .

If $\pi(\mathbf{u})\leq 0$ then $\mathbf{v}_{3}=\infty$ and we deduce from (4.31)

[TABLE]

thanks to (4.30). If $\pi(\mathbf{u})>0$ the definition of $\mathbf{v_{3}}$ implies that

[TABLE]

The last two results together with (4.29) and (4.30) prove (4.27) in the case $\varkappa_{1}(p^{*},\mathbf{u})\geq 0$.

$\mathbf{3^{0}c2}.\;$ Consider the case $\varkappa_{1}(p^{*},\mathbf{u})<0,\;Y\geq[X+1]\mathbf{y}^{-1}-1/\mathbf{u}$ . Here $\widetilde{\mathbf{v}}=\overline{\mathbf{v}}$ .

If $\tau(p^{*})>0$ then $\overline{\mathbf{v}}=\mathbf{v}$ . Moreover $\mathbf{y}=\mathbf{u}^{*}$ since $\mathbf{u}^{*}=\infty$ if $\tau(\infty)\geq 0$ and $\tau(\mathbf{u}^{*})=0$ if $\tau(\infty)<0$ . Hence in view of (3.26) of Lemma 1

[TABLE]

We have in view of the definition of $\mathbf{v}$

[TABLE]

Note that,

[TABLE]

To get the last inequality we have used that

[TABLE]

Thus, we conclude that $P(\mathbf{v})\leq 1,$ which together with (4.30) implies (4.27) in the considered case.

If $\tau(p^{*})<0$ then $\overline{\mathbf{v}}=\mathbf{v_{2}}$. Moreover $\mathbf{y}=p^{*}$. We have in view of the definition of $\mathbf{v_{2}}$

[TABLE]

After routine computations we come to the following equality

[TABLE]

Hence, $P(\mathbf{v_{2}})\leq 1$ for all $n$ large enough, which together with (4.30) allows us to assert (4.27) in the considered case.

$\mathbf{3^{0}c3}.\;$ Consider the case $\varkappa_{1}(p^{*},\mathbf{u})<0,\;Y<[X+1]\mathbf{y}^{-1}-1/\mathbf{u}$ . Here $\widetilde{\mathbf{v}}=\overline{\mathbf{v}}\wedge\mathbf{v_{3}}$ .

If $\pi(\mathbf{u})\leq 0$ the required result follows from (4.32).

If $\pi(\mathbf{u})>0$ then by (4.31) $P(\cdot)$ is strictly increasing and, therefore,

[TABLE]

in view of (4.33). This completes the proof of (4.27).

Finally, to conclude in the case $\mathbf{3^{0}c}$, choosing $\mathfrak{a}\leq\sqrt{1/S_{1}}$, we deduce from (4.26) and (4.27) that for all $n$ large enough

$G_{n}\big{(}\vec{\boldsymbol{h}}(v,\mathbf{u})\big{)}\leq\sqrt{S_{1}}\mathfrak{a}av\leq av,\quad v\in{\cal I}_{\mathbf{u}}(1).$
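
The second inequality in the display above uses only the choice of $\mathfrak{a}$. Spelled out, assuming $a>0$, $v>0$ and $S_{1}>0$:

$\displaystyle{\mathfrak{a}\leq S_{1}^{-1/2}\;\Longrightarrow\;\sqrt{S_{1}}\,\mathfrak{a}\leq 1\;\Longrightarrow\;\sqrt{S_{1}}\,\mathfrak{a}\,a\,v\leq a\,v.}$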

4.3 Proof of Proposition 3

In view of Lemma 5 in Lepski (2015), if $\tau(p^{*})>0$ then

[TABLE]

where $c_{2}$ is independent of $\vec{L}$. Note also that $\gamma_{j}\leq\beta_{j}$ for any $j=1,\ldots,d$.

$\mathbf{1^{0}}.\;$ Let $\big{(}\vec{\pi},\vec{s}\big{)}$ be either $\big{(}\vec{\beta},\vec{r}\big{)}$ or $\big{(}\vec{\gamma},\vec{q}\big{)}$ and without further mentioning the couple $\big{(}\vec{\gamma},\vec{q}\big{)}$ is used below under the condition $\tau(p^{*})>0$ . We obviously have for any $\vec{\mathbf{h}}\in{\cal H}$

[TABLE]

For $j=1,\ldots,d$ we have

[TABLE]

The last equality follows from the definition of the $\ell$-th order difference operator (2.4). Hence, for any $j\in{\cal J}_{\infty}$ we have in view of the definition of the Nikol’skii class (recall that $\gamma_{j}=\beta_{j},j\in{\cal J}_{\infty}$)

[TABLE]

This yields for any $\mathbf{h}\in{\cal H}$

[TABLE]

and the first and the second assertions of the proposition are proved for any $j\in{\cal J}_{\infty}$ .

Let $j\in\bar{{\cal J}}_{\infty}$ . Choosing $\mathbf{k}$ from the relation $e^{\mathbf{k}}=\mathbf{h}$ (recall that $\mathbf{h}\in{\cal H}$ ), we have for any $x\in{\mathbb{R}}^{d}$

[TABLE]

In view of the monotone convergence theorem and the triangle inequality, we have

[TABLE]

By the Minkowski inequality for integrals [see, e.g., (Folland, 1999, Section 6.3)], we obtain

[TABLE]
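
For reference, the Minkowski inequality for integrals invoked here can be stated in the following general form. The function $F$ (measurable on ${\mathbb{R}}^{d}\times{\mathbb{R}}$) and the exponent $s\in[1,\infty]$ are notation introduced only for this illustration:

$\displaystyle{\bigg\|\int_{{\mathbb{R}}}F(\cdot,u)\,\nu_{1}({\rm d}u)\bigg\|_{s}\leq\int_{{\mathbb{R}}}\big\|F(\cdot,u)\big\|_{s}\,\nu_{1}({\rm d}u).}$

One natural application, given here as a sketch and not necessarily the exact form used by the authors, takes $F(x,u)={\cal K}_{\ell}(u)\big[f\big(x+uh\mathbf{e}_{j}\big)-f(x)\big]$, which gives

$\displaystyle{\bigg\|\int_{{\mathbb{R}}}{\cal K}_{\ell}(u)\big[f\big(\cdot+uh\mathbf{e}_{j}\big)-f(\cdot)\big]\nu_{1}({\rm d}u)\bigg\|_{s}\leq\int_{{\mathbb{R}}}\big|{\cal K}_{\ell}(u)\big|\,\big\|f\big(\cdot+uh\mathbf{e}_{j}\big)-f(\cdot)\big\|_{s}\,\nu_{1}({\rm d}u).}$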

Taking into account that $f\in{\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}$ and (4.36), we have for any $j=1,\ldots,d,$

[TABLE]

This proves the first and the second assertions of the proposition for any $j\in\bar{{\cal J}}_{\infty}$ .

$\mathbf{2^{0}}.\;$ Set $\mathbb{F}={\mathbb{N}}_{\vec{r},d}\big{(}\vec{\beta},\vec{L}\big{)}$ and recall that

$\mathbf{B}^{*}_{j,s_{j},\mathbb{F}}(\mathbf{h}):=\displaystyle{\sup_{f\in\mathbb{F}}\sum_{h\in{\cal H}:\>h\leq\mathbf{h}}}\bigg{\|}\int_{{\mathbb{R}}}{\cal K}_{\ell}(u)\big{[}f\big{(}x+uh\mathbf{e}_{j}\big{)}-f(x)\big{]}\nu_{1}({\rm d}u)\bigg{\|}_{s_{j}}\leq\sup_{f\in\mathbb{F}}\sum_{h\in{\cal H}:\>h\leq\mathbf{h}}\big{\|}b_{h,f,j}\big{\|}_{s_{j}}.$

Hence, the third assertion follows from (4.37) and (4.38).

5 Appendix

5.1 Proof of Lemma 1

Note that

[TABLE]

and (3.25) follows. On the other hand we have

[TABLE]

and (3.26) is checked if $\tau(\infty)\geq 0$ since $X,Y\geq 0$ . If $\tau(\infty)<0$ and $\tau(p^{*})>0$ then we note first that necessarily $\mathbf{u}^{*}>p^{*}$ since $\tau(\mathbf{u}^{*})=0$ and $\tau(\cdot)$ is strictly decreasing. Hence $\mathbf{y}=\mathbf{u}^{*}$ and

[TABLE]

and (3.26) is established.
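
The claim that necessarily $\mathbf{u}^{*}>p^{*}$ when $\tau(\infty)<0$ and $\tau(p^{*})>0$ can be checked by contradiction, using only that $\tau(\cdot)$ is strictly decreasing and $\tau(\mathbf{u}^{*})=0$:

$\displaystyle{\mathbf{u}^{*}\leq p^{*}\;\Longrightarrow\;0=\tau(\mathbf{u}^{*})\geq\tau(p^{*})>0,}$

which is impossible; hence $\mathbf{u}^{*}>p^{*}$.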

Let us prove (3.27). First we note that (3.27) is obvious if $\tau(\infty)\geq 0$ because in this case $\tau(s)>0$ for any $s\geq 1$ . Thus, from now on we will assume that $\tau(\infty)<0$ .

Next, if $\mathbf{u}^{*}>p^{*}$ then (3.27) holds. Indeed, in this case $0<Y-[X+1]\mathbf{y}^{-1}=Y-[X+1](\mathbf{u}^{*})^{-1}$ implies $\mathbf{u}^{*}>(X+1)/Y$ . Hence any number from the interval $\big{(}p^{*}\vee(X+1)/Y,\mathbf{u}^{*}\big{)}$ satisfies (3.27). At last, note that if $p^{*}\geq\mathbf{u}^{*}$ we have

[TABLE]

since $1/\beta(0)\leq p^{*}/\omega(0)$ in view of $r_{j}\leq p^{*}$ for any $j=1,\ldots,d$ . The obtained contradiction completes the proof of (3.27).
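
The implication used above in the case $\mathbf{u}^{*}>p^{*}$ is elementary. As a sketch, assuming $X\geq 0$ and $0<\mathbf{u}^{*}<\infty$ (the latter holds here since $\tau(\infty)<0$):

$\displaystyle{0<Y-[X+1](\mathbf{u}^{*})^{-1}\;\Longrightarrow\;Y>\frac{X+1}{\mathbf{u}^{*}}>0\;\Longrightarrow\;\mathbf{u}^{*}>\frac{X+1}{Y}.}$

Together with $\mathbf{u}^{*}>p^{*}$ this shows that the interval $\big(p^{*}\vee(X+1)/Y,\mathbf{u}^{*}\big)$ is non-empty.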

5.2 Proof of Lemma 2

Indeed,

[TABLE]

Moreover, in view of the latter inequality

[TABLE]

It remains to note that $1-\big{[}\tau(p_{\pm})\beta(0)p_{\pm}\big{]}^{-1}=\tau(\infty)/\tau(p_{\pm})$ and the lemma follows.
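
The closing identity can be verified in one line. The sketch below assumes that $\tau(s)=\tau(\infty)+[s\beta(0)]^{-1}$ for $s\in[1,\infty)$ and that $\tau(p_{\pm})\neq 0$; the definition of $\tau$ is not restated in this section, so this form is an assumption, chosen because it is consistent with the identity and with the monotonicity of $\tau(\cdot)$ used earlier:

$\displaystyle{1-\frac{1}{\tau(p_{\pm})\beta(0)p_{\pm}}=\frac{\tau(p_{\pm})-[\beta(0)p_{\pm}]^{-1}}{\tau(p_{\pm})}=\frac{\tau(\infty)}{\tau(p_{\pm})}.}$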

Bibliography

Akakpo, N. (2012). Adaptation to anisotropy and inhomogeneity via dyadic piecewise polynomial selection. Math. Methods Statist. 21, 1–28.
Birgé, L. (2008). Model selection for density estimation with ${\mathbb{L}}_{2}$-loss. arXiv:0808.1416v2, http://arxiv.org
Comte, F., Rozenholc, Y. and Taupin, M.-L. (2006). Penalized contrast estimator for adaptive density deconvolution. Canad. J. Statist. 34, 3, 431–452.
Comte, F. and Lacour, C. (2013). Anisotropic adaptive kernel deconvolution. Ann. Inst. H. Poincaré Probab. Statist. 49, 2, 569–609.
Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist. 19, 3, 1257–1272.
Fan, J. (1993). Adaptively local one-dimensional subproblems with application to a deconvolution problem. Ann. Statist. 21, 2, 600–610.
Fan, J. and Koo, J. (2002). Wavelet deconvolution. IEEE Trans. Inform. Theory 48, 734–747.
