A Hoeffding's inequality for uniformly ergodic diffusion process

Michael C.H. Choi; Evelyn Li

arXiv:1903.10125·math.PR·March 26, 2019

TL;DR

This paper extends Hoeffding's inequality to continuous-time uniformly ergodic diffusion processes, giving a non-asymptotic bound on the probability that the time average of a bounded function deviates from its stationary mean.

Contribution

It introduces a Hoeffding's inequality for diffusion processes, bridging a gap between discrete-time Markov chains and continuous-time diffusions.

Findings

01 Derived a Hoeffding's inequality for diffusion processes.

02 Illustrated the results with examples involving the Jacobi diffusion and a finite interval analogue of the Ornstein-Uhlenbeck process.

03 Provided bounds for large deviation probabilities in continuous-time settings.

Abstract

In this note, we present a version of Hoeffding's inequality in a continuous-time setting, where the data stream comes from a uniformly ergodic diffusion process. Similar to the well-studied case of Hoeffding's inequality for discrete-time uniformly ergodic Markov chain, the proof relies on techniques ranging from martingale theory to classical Hoeffding's lemma as well as the notion of deviation kernel of diffusion process. We present two examples to illustrate our results. In the first example we consider large deviation probability on the occupation time of the Jacobi diffusion, a popular process used in modelling of exchange rates in mathematical finance, while in the second example we look at the exponential functional of a finite interval analogue of the Ornstein-Uhlenbeck process introduced by Kessler and Sørensen (1999).

Equations

$$\mathcal{A}:=\mu(x)\frac{d}{dx}+\frac{1}{2}\sigma^{2}(x)\frac{d^{2}}{dx^{2}},$$

$$Q^{\sharp}:=\int_{0}^{\infty}\left(P^{t}-\Pi\right)dt,$$

$$S(x):=\int_{x_{0}}^{x}\exp\left\{-\int_{x_{0}}^{y}\frac{2\mu(z)}{\sigma^{2}(z)}\,dz\right\}dy,\quad M(x):=\int_{l}^{x}\frac{2}{\sigma^{2}(y)}\exp\left\{\int_{x_{0}}^{y}\frac{2\mu(z)}{\sigma^{2}(z)}\,dz\right\}dy,$$

$$s(x):=\frac{d}{dx}S(x)=\exp\left\{-\int_{x_{0}}^{x}\frac{2\mu(z)}{\sigma^{2}(z)}\,dz\right\},\quad m(x):=\frac{d}{dx}M(x)=\frac{2}{\sigma^{2}(x)s(x)}.$$

$$\sup_{x\in S}\left\|P^{t}(x,\cdot)-\pi\right\|_{TV}\leqslant Ce^{-\beta t},$$

$$t_{av}:=\int_{S\times S}\mathbb{E}_{x}\left[\tau_{y}\right]\pi(dx)\pi(dy)$$

$$\sum_{i\geqslant 1}\frac{1}{\lambda_{i}}<\infty,$$

$$\int_{S}m([x,\infty])s(x)\,dx=\frac{1}{(\gamma-1)(\gamma-2)}<\infty,$$

$$\left\|Q^{\sharp}\right\|\leqslant 2t_{av}<\infty,$$

$$\mathbb{P}_{x}\left(\frac{1}{t}\int_{0}^{t}f(X_{s})\,ds-\pi(f)\geqslant\varepsilon\right)\leqslant\exp\left\{\frac{-2\left(t\varepsilon-2\left\|f\right\|\left\|Q^{\sharp}\right\|\right)^{2}}{(t+1)\left\|f\right\|^{2}\left(2\left\|Q^{\sharp}\right\|+1\right)^{2}}\right\},$$

$$\mathcal{A}^{J}=(a-bx)\frac{d}{dx}+\frac{\sigma^{2}}{2}x(1-x)\frac{d^{2}}{dx^{2}},$$

$$\mathbb{P}_{x}\left(\frac{1}{t}\int_{0}^{t}\mathds{1}_{A}(X_{s})\,ds-\pi(\mathds{1}_{A})\geqslant\varepsilon\right)\leqslant\exp\left\{\frac{-2\left(t\varepsilon-4t_{av}\right)^{2}}{(t+1)\left(4t_{av}+1\right)^{2}}\right\},$$

$$t_{av}=\frac{2}{\sigma^{2}}\sum_{i=1}^{\infty}\frac{1}{i\left(i-1+\frac{2b}{\sigma^{2}}\right)}$$

$$\mathcal{A}^{O}=-\rho\tan(x)\frac{d}{dx}+\frac{1}{2}\frac{d^{2}}{dx^{2}}.$$

$$\pi(x)=\frac{\cos(x)}{2}\mathds{1}_{x\in(-\pi/2,\pi/2)}.$$

$$\int_{0}^{t}f(X_{s})\,ds=\int_{0}^{t}e^{uX_{s}}\,ds,$$

$$\mathbb{P}_{x}\left(\int_{0}^{t}e^{uX_{s}}\,ds-\frac{t\cosh(u\pi/2)}{1+u^{2}}\geqslant t\varepsilon\right)\leqslant\exp\left\{\frac{-2\left(t\varepsilon-4e^{u\pi/2}t_{av}\right)^{2}}{(t+1)e^{u\pi}\left(4t_{av}+1\right)^{2}}\right\},$$

$$t_{av}=\sum_{i=1}^{\infty}\frac{2}{i(i+1)}=2$$

$$\left\|\hat{f}\right\|\leqslant\left\|f\right\|\left\|Q^{\sharp}\right\|,$$

$$\mathbb{P}_{x}\left(\frac{1}{t}\int_{0}^{t}f(X_{s})\,ds-\pi(f)\geqslant\varepsilon\right)\leqslant e^{-\theta t\varepsilon}\,\mathbb{E}_{x}\left[e^{\theta\int_{0}^{t}f(X_{s})\,ds}\right]=e^{-\theta t\varepsilon}\,\mathbb{E}_{x}\left[e^{-\theta\int_{0}^{t}\mathcal{A}\hat{f}(X_{s})\,ds}\right].$$

$$\mathcal{M}_{t}^{\hat{f}}:=\hat{f}(X_{t})-\hat{f}(X_{0})-\int_{0}^{t}\mathcal{A}\hat{f}(X_{s})\,ds.$$

$$e^{-\theta t\varepsilon}\,\mathbb{E}_{x}\left[e^{-\theta\int_{0}^{t}\mathcal{A}\hat{f}(X_{s})\,ds}\right]\leqslant e^{-\theta t\varepsilon}e^{2\theta\left\|Q^{\sharp}\right\|\left\|f\right\|}\,\mathbb{E}_{x}\left[e^{\theta\mathcal{M}_{t}^{\hat{f}}}\right].$$

$$\left|\mathcal{M}_{s}^{\hat{f}}-\mathcal{M}_{s-1}^{\hat{f}}\right|\leqslant 2\left\|Q^{\sharp}\right\|\left\|f\right\|+\left\|f\right\|=\left(2\left\|Q^{\sharp}\right\|+1\right)\left\|f\right\|,$$

Full text

Institute for Data and Decision Analytics, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P.R. China

School of Mathematics, Sun Yat-Sen University, Guangzhou, Guangdong, 510275, P.R. China

AMS 2010 subject classifications: 60F10, 60J10, 60G44

Keywords: Diffusion process; Hoeffding’s inequality; large deviations

1. Introduction and main results

The seminal work of Hoeffding (1963), which bounds the large deviation probability of a sum of bounded random variables, has become a classical tool in probability theory. In particular, it has far-reaching applications in statistics and machine learning; see for instance Devroye et al. (1996) and the references therein. Hoeffding’s inequality has since been refined and extended to various settings. For example, motivated by applications in Markov decision processes and reinforcement learning, Glynn and Ormoneit (2002) derive a Hoeffding’s inequality for uniformly ergodic Markov chains, while Boucher (2009) presents another method to prove Hoeffding’s inequality in terms of the Drazin inverse of the Markov chain.

Inspired by the work cited above, we aim to extend Hoeffding’s inequality to the setting of diffusion processes. Contrary to the classical setting, we assume that the data stream arrives continuously from a uniformly ergodic diffusion process. The major difficulty of the analysis is twofold. First, as we have a continuous data stream instead of discrete data points, previous analysis does not carry over to this setting easily. Second, the dependency within the data stream complicates the situation. To overcome these difficulties, we employ classical martingale techniques for diffusion processes as well as the notion of deviation kernel. Comparing our result with the existing literature on concentration inequalities for diffusion processes (Galtchouk and Pergamenshchikov, 2007), we argue that our proof is conceptually simpler since it uses techniques similar to those of the discrete-time Markov chain case (Glynn and Ormoneit, 2002; Boucher, 2009). In addition, as we shall see in Corollary 1.1 below, it is readily applicable as long as we have the relevant eigenvalue information of the generator of the diffusion.

To this end, we fix our notation and introduce the tools we need for our main result, Theorem 1.1 below. Let $(\Omega,\mathcal{F},(\mathcal{F}_{t})_{t\geqslant 0},\mathbb{P})$ be a filtered probability space satisfying the usual conditions. Suppose that we have an ergodic diffusion process $X=(X_{t})_{t\geqslant 0}$ on state space $S$ with transition kernel $P$, transition density $P(x,dy)$ and stationary distribution $\pi$. We write $\mathbb{P}_{x}\left(\cdot\right):=\mathbb{P}\left(\cdot\mid X_{0}=x\right)$ and $\mathbb{E}_{x}\left[\cdot\right]:=\mathbb{E}\left[\cdot\mid X_{0}=x\right]$ for the conditional probability and expectation when the process is initialized at $X_{0}=x\in S$. The process $X$ is characterized by the infinitesimal generator $\mathcal{A}$, which acts on the space of twice differentiable functions and is defined to be

$$\mathcal{A}:=\mu(x)\frac{d}{dx}+\frac{1}{2}\sigma^{2}(x)\frac{d^{2}}{dx^{2}},$$

where $\mu(x)$ and $\sigma^{2}(x)$ are respectively known as the drift and diffusion coefficient of $X$ . A tool that we will use in the main result below is the deviation kernel $Q^{\sharp}$ of $X$ , which is defined as

$$Q^{\sharp}:=\int_{0}^{\infty}\left(P^{t}-\Pi\right)dt,$$

where $\Pi$ is the projection kernel with density $\Pi(x,dy)=\pi(dy)$ for all $x\in S$. It is well-known that the function $\hat{f}:=Q^{\sharp}f$ solves the Poisson equation $-\mathcal{A}\hat{f}=f$, see e.g. Glynn and Meyn (1996). For a bounded function $f$, we define the supremum norm to be $\left\|f\right\|:=\sup_{x}\left|f(x)\right|$. We also write $\left\|Q^{\sharp}\right\|:=\sup_{\left\|f\right\|\leqslant 1}\left\|Q^{\sharp}f\right\|$ for the induced operator norm of $Q^{\sharp}$ on the space of bounded functions. For further references on $Q^{\sharp}$, we refer readers to the work of Cheng and Mao (2015); Whitt (1992); Mao (2002). On a one-dimensional state space $S=(l,u)$, we now recall two fundamental notions associated with the diffusion $X$, namely the scale function $S(x)$ and the speed function $M(x)$. For $x\in(l,u)$, these functions are defined by

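$$S(x):=\int_{x_{0}}^{x}\exp\left\{-\int_{x_{0}}^{y}\frac{2\mu(z)}{\sigma^{2}(z)}\,dz\right\}dy,\quad M(x):=\int_{l}^{x}\frac{2}{\sigma^{2}(y)}\exp\left\{\int_{x_{0}}^{y}\frac{2\mu(z)}{\sigma^{2}(z)}\,dz\right\}dy,$$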

where $x_{0}\in S$ is a fixed and arbitrary reference point. Their respective densities are given by

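$$s(x):=\frac{d}{dx}S(x)=\exp\left\{-\int_{x_{0}}^{x}\frac{2\mu(z)}{\sigma^{2}(z)}\,dz\right\},\quad m(x):=\frac{d}{dx}M(x)=\frac{2}{\sigma^{2}(x)s(x)}.$$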
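The scale and speed densities can be evaluated numerically from the drift and diffusion coefficient alone. The following is a minimal sketch, assuming a simple midpoint rule for the inner integral; the function names `scale_density` and `speed_density`, the reference point `x0 = 0` and the grid size `n` are illustrative choices, not part of the paper. It is tried on the drift $\mu(x)=-\rho\tan(x)$ with $\sigma^{2}(x)=1$ (the second example treated later in the paper), for which $s(x)=\cos(x)^{-2\rho}$ and $m(x)=2\cos(x)^{2\rho}$ in closed form.

```python
import math

def scale_density(mu, sigma2, x, x0=0.0, n=2000):
    """s(x) = exp(-int_{x0}^{x} 2*mu(z)/sigma2(z) dz), inner integral by the midpoint rule."""
    h = (x - x0) / n  # signed step, so x < x0 also works
    integral = 0.0
    for k in range(n):
        z = x0 + (k + 0.5) * h
        integral += 2.0 * mu(z) / sigma2(z)
    return math.exp(-integral * h)

def speed_density(mu, sigma2, x, x0=0.0, n=2000):
    """m(x) = 2 / (sigma2(x) * s(x))."""
    return 2.0 / (sigma2(x) * scale_density(mu, sigma2, x, x0, n))

# Illustrative coefficients: drift -rho*tan(x), unit diffusion coefficient.
rho = 0.5
mu = lambda z: -rho * math.tan(z)
sigma2 = lambda z: 1.0

x = 0.7
s_val = scale_density(mu, sigma2, x)
m_val = speed_density(mu, sigma2, x)
print(s_val, 1.0 / math.cos(x))   # numerical vs closed-form s(x) for rho = 1/2
print(m_val, 2.0 * math.cos(x))   # numerical vs closed-form m(x) for rho = 1/2
```

For $\rho=1/2$ the speed density is proportional to $\cos(x)$, which agrees with the stationary density stated for that example.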

In this note, we are primarily interested in uniformly ergodic diffusions, that is, ergodic diffusions whose convergence to equilibrium in total variation distance is bounded uniformly in the starting point: for $t\geqslant 0$ and some constants $C<\infty$, $\beta>0$,

$$\sup_{x\in S}\left\|P^{t}(x,\cdot)-\pi\right\|_{TV}\leqslant Ce^{-\beta t},$$

where $\|P^{t}(x,\cdot)-\pi\|_{TV}:=\sup_{A}|P^{t}(x,A)-\pi(A)|$ is the total variation distance between $P^{t}(x,\cdot)$ and $\pi$. We write $\tau_{y}:=\inf\{t\geqslant 0:\,X_{t}=y\}$ to be the first hitting time of $y$ and

$$t_{av}:=\int_{S\times S}\mathbb{E}_{x}\left[\tau_{y}\right]\pi(dx)\pi(dy)$$

to be the average hitting time of $X$. While verifying uniform ergodicity can be quite difficult, it turns out that, according to (Cheng and Mao, 2015, Theorem 2.2), uniform ergodicity for a diffusion on $(0,u)$ with reflecting boundary at $0$ is equivalent to a few readily checkable conditions on $t_{av}$, $s(x)$ and $m(x)$:

Proposition 1.1 (Necessary and sufficient conditions for uniform ergodicity Cheng and Mao (2015)).

Given an ergodic diffusion $X$ on $S=(0,u)$ with reflecting boundary at $0$, the following statements are equivalent:

(1) $X$ is uniformly ergodic;

(2) $\int_{S}m([x,u])s(x)\,dx<\infty$;

(3) $t_{av}<\infty$;

(4) $\sigma_{ess}(\mathcal{A})=\emptyset$ and

$$\sum_{i\geqslant 1}\frac{1}{\lambda_{i}}<\infty,$$

where $\sigma_{ess}(\mathcal{A})$ is the essential spectrum of $\mathcal{A}$ and $(\lambda_{i})_{i\geqslant 1}$ are the non-zero eigenvalues of $-\mathcal{A}$ .

At times it may be easier to check item (2), as it depends on $\mu(x)$ and $\sigma^{2}(x)$ only through $s(x)$ and $m(x)$, while at other times, when eigenvalue information is available, it may be more convenient to check item (4). As a simple illustration of item (2), we consider the class of diffusions with $\mu(x)=0$, $\sigma^{2}(x)=2(1+x)^{\gamma}$ and $S=(0,\infty)$, where $\gamma>2$ is a parameter. This class was first studied in Mao (2002). It is easy to see that $s(x)=1$ and $m(x)=(1+x)^{-\gamma}$. As a result, item (2) now reads

$$\int_{S}m([x,\infty])s(x)\,dx=\frac{1}{(\gamma-1)(\gamma-2)}<\infty,$$

and so this class of diffusions with $\gamma>2$ is uniformly ergodic. For an illustration of item (4), we refer the reader to Corollary 1.1, where we discuss the Jacobi process. In view of Proposition 1.1, for uniformly ergodic $X$ we have

$$\left\|Q^{\sharp}\right\|\leqslant 2t_{av}<\infty,$$

where the first inequality follows from (Choi, 2018, Theorem $1.1$ ). In other words, for uniformly ergodic diffusion the induced operator norm $||Q^{\sharp}||$ is finite. Such a term will appear in our version of Hoeffding’s inequality Theorem 1.1 below.
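The closed form quoted for item (2) in the $\gamma$-example above can be checked numerically. The sketch below is illustrative only: the helper names, the change of variable $x=y/(1-y)$ and the midpoint rule are assumptions made for the check, not the authors' method. It uses $s(x)=1$ and the tail mass $m([x,\infty))=(1+x)^{1-\gamma}/(\gamma-1)$ obtained by integrating $m(y)=(1+y)^{-\gamma}$.

```python
def tail_speed_mass(x, gamma):
    """m([x, inf)) for m(y) = (1 + y)**(-gamma), valid for gamma > 1."""
    return (1.0 + x) ** (1.0 - gamma) / (gamma - 1.0)

def item2_integral(gamma, n=100_000):
    """Midpoint-rule value of int_0^inf m([x, inf)) * s(x) dx with s(x) = 1.

    The substitution x = y / (1 - y) maps (0, inf) onto (0, 1),
    with dx = dy / (1 - y)**2.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * h
        x = y / (1.0 - y)
        total += tail_speed_mass(x, gamma) / (1.0 - y) ** 2
    return total * h

gamma = 4.0
numeric = item2_integral(gamma)
closed_form = 1.0 / ((gamma - 1.0) * (gamma - 2.0))
print(numeric, closed_form)
```

For $\gamma\leqslant 2$ the transformed integrand is no longer integrable near $y=1$, which mirrors the requirement $\gamma>2$ in the text.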

With the above notation, we are now ready to state our main result. It follows from the classical ergodic theorem that for a bounded function $f$, the time average $\frac{1}{t}\int_{0}^{t}f(X_{s})ds$ converges almost surely to the space average $\pi(f):=\int_{S}f(x)\pi(dx)$ as $t\to\infty$, see e.g. (Bhattacharya and Waymire, 2009, Theorem $12.2$). In our main result below, we present a non-asymptotic probabilistic error bound for this convergence:

Theorem 1.1.

Suppose that $X=\left(X_{t}\right)_{t\geqslant 0}$ is a uniformly ergodic diffusion and $f$ is a bounded function. For any $\varepsilon>0$ , $t>\frac{2\left\|f\right\|\left\|Q^{\sharp}\right\|}{\varepsilon}$ , $x\in S$ ,

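$$\mathbb{P}_{x}\left(\frac{1}{t}\int_{0}^{t}f(X_{s})\,ds-\pi(f)\geqslant\varepsilon\right)\leqslant\exp\left\{\frac{-2\left(t\varepsilon-2\left\|f\right\|\left\|Q^{\sharp}\right\|\right)^{2}}{(t+1)\left\|f\right\|^{2}\left(2\left\|Q^{\sharp}\right\|+1\right)^{2}}\right\},$$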

where $Q^{\sharp}$ is the deviation kernel of the process $X$ .
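To get a feel for the size of the bound in Theorem 1.1, its right-hand side can be evaluated directly. The following is a minimal sketch; the function name `hoeffding_bound` and the sample values of `eps`, `f_norm` and `q_norm` are hypothetical. Since the right-hand side is increasing in $\|Q^{\sharp}\|$ on the admissible range of $t$, passing an upper bound on $\|Q^{\sharp}\|$ (such as $2t_{av}$) still yields a valid, if looser, bound.

```python
import math

def hoeffding_bound(t, eps, f_norm, q_norm):
    """Right-hand side of the bound in Theorem 1.1.

    t       time horizon
    eps     deviation level (> 0)
    f_norm  supremum norm of f (> 0)
    q_norm  operator norm of the deviation kernel, or an upper bound on it
    """
    if eps <= 0 or f_norm <= 0 or q_norm < 0:
        raise ValueError("need eps > 0, f_norm > 0 and q_norm >= 0")
    threshold = 2.0 * f_norm * q_norm / eps
    if t <= threshold:
        raise ValueError(f"Theorem 1.1 requires t > {threshold:.4g}")
    numerator = -2.0 * (t * eps - 2.0 * f_norm * q_norm) ** 2
    denominator = (t + 1.0) * f_norm ** 2 * (2.0 * q_norm + 1.0) ** 2
    return math.exp(numerator / denominator)

# Hypothetical inputs: ||f|| = 1, ||Q#|| bounded by 4, deviation level 0.5.
for t in (100.0, 1_000.0, 10_000.0):
    print(t, hoeffding_bound(t, eps=0.5, f_norm=1.0, q_norm=4.0))
```

The exponent grows linearly in $t$ for large $t$, so the bound decays exponentially in the time horizon once $t$ is well above the threshold $2\|f\|\|Q^{\sharp}\|/\varepsilon$.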

Remark 1.1 (On the assumption of bounded $f$).

As usual in the Hoeffding’s inequality literature, our main result Theorem 1.1 requires the function $f$ to be bounded. This assumption is crucial when we apply the classical Hoeffding’s lemma (Devroye et al., 1996, Lemma $8.1$) to the martingale difference sequence in (2.6) and (2.7) below, since the lemma only holds when the random variable of interest is bounded. Although there is an extension of Hoeffding’s lemma to non-negative random variables with finite mean (Bentkus, 2008), this result is difficult to apply in our setting, as one needs to find random variables that stochastically dominate the martingale difference sequence. We leave the question of extending the main result to unbounded $f$ as future work.

As our first example to illustrate our main result Theorem 1.1, we investigate the Jacobi process $X=\left(X_{t}\right)_{t\geqslant 0}$ on the state space $S=(0,1)$ . The generator of this process is given by

$$\mathcal{A}^{J}=(a-bx)\frac{d}{dx}+\frac{\sigma^{2}}{2}x(1-x)\frac{d^{2}}{dx^{2}},\tag{1.1}$$

where $a,b,\sigma\in\mathbb{R}$ are parameters of $X$, assumed to take values such that $\alpha:=\frac{2b}{\sigma^{2}}-\frac{2a}{\sigma^{2}}-1>-1$ and $\beta:=\frac{2a}{\sigma^{2}}-1>-1$, i.e. $b>a>0$ and $\sigma\in\mathbb{R}$. With these choices of parameters, the stationary distribution of the Jacobi process is the Beta distribution with parameters $\alpha+1$ and $\beta+1$, with density $\pi(x)=\frac{x^{\beta}(1-x)^{\alpha}}{B\left(\alpha+1,\beta+1\right)}$, where $B(\cdot,\cdot)$ denotes the Beta function. According to (Albanese and Kuznetsov, 2009, Appendix $B.3$) and (Forman and Sørensen, 2008, Section $2.1$), $X$ is ergodic with $\sigma_{ess}(\mathcal{A}^{J})=\emptyset$ and

$$\lambda_{i}=\frac{\sigma^{2}}{2}\,i\left(i-1+\frac{2b}{\sigma^{2}}\right),\quad i\geqslant 1.$$

In view of Proposition 1.1 item (4), $X$ is thus uniformly ergodic. One major motivation for studying this process stems from its use in financial modelling, where a more general form of the Jacobi process has been employed to model exchange rates in a target zone, see Larsen and Sørensen (2007) and the references therein. In these models, one is often interested in the long-run average of the occupation time of the process in a certain region $A$, say the occupation time of the exchange rate above or below a threshold. Unfortunately, distributional information on the functional $\int_{0}^{t}\mathds{1}_{A}(X_{s})\,ds$ is often inaccessible, where $\mathds{1}_{A}$ is the indicator function of the set $A$. In practice, one may resort to the space average $\pi(\mathds{1}_{A})$ as a natural approximation of the time-averaged occupation time $\frac{1}{t}\int_{0}^{t}\mathds{1}_{A}(X_{s})\,ds$, since the former is often easier to access than the latter. Our main result in Theorem 1.1 gives non-asymptotic probabilistic error bounds on this approximation. Theorem 1.1 is also useful for constructing confidence intervals for the functional $\int_{0}^{t}\mathds{1}_{A}(X_{s})\,ds$, since a confidence band follows directly from these large deviation probabilities. With these motivations in mind, we now apply Theorem 1.1 to the Jacobi process with $f=\mathds{1}_{A}$, which gives:

Corollary 1.1.

Suppose that $X=\left(X_{t}\right)_{t\geqslant 0}$ is the Jacobi process which is uniformly ergodic with generator given by (1.1) and parameters $b>a>0,\sigma\in\mathbb{R}$ . For any $\varepsilon>0$ , $t>\frac{4t_{av}}{\varepsilon}$ and measurable subset $A\subseteq(0,1)$ , we have

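$$\mathbb{P}_{x}\left(\frac{1}{t}\int_{0}^{t}\mathds{1}_{A}(X_{s})\,ds-\pi(\mathds{1}_{A})\geqslant\varepsilon\right)\leqslant\exp\left\{\frac{-2\left(t\varepsilon-4t_{av}\right)^{2}}{(t+1)\left(4t_{av}+1\right)^{2}}\right\},\tag{1.2}$$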

where

$$t_{av}=\frac{2}{\sigma^{2}}\sum_{i=1}^{\infty}\frac{1}{i\left(i-1+\frac{2b}{\sigma^{2}}\right)}$$

is the average hitting time of the Jacobi process.

As our first remark, we note that the upper bound in (1.2) can be quite loose since it does not depend on the size of $A$ . Such a bound indeed holds as long as we have $||f||\leqslant 1$ in Theorem 1.1. In addition, we see that this upper bound depends only on the parameters $b$ and $\sigma$ through $t_{av}$ but not on $a$ .
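The bound in Corollary 1.1 can be evaluated once $t_{av}$ is approximated by a partial sum of its series. The sketch below is illustrative: the function names, the truncation level `n_terms` and the parameter values $b=0.5$, $\sigma=1$ are assumptions chosen so that $2b/\sigma^{2}=1$ and the series reduces to $\sum_{i\geqslant 1}i^{-2}=\pi^{2}/6$, which gives a closed form to compare against.

```python
import math

def jacobi_t_av(b, sigma, n_terms=200_000):
    """Partial sum of t_av = (2/sigma^2) * sum_{i>=1} 1 / (i * (i - 1 + 2b/sigma^2))."""
    c = 2.0 * b / sigma ** 2
    partial = sum(1.0 / (i * (i - 1.0 + c)) for i in range(1, n_terms + 1))
    return 2.0 / sigma ** 2 * partial

def occupation_bound(t, eps, t_av):
    """Right-hand side of the occupation-time bound; requires t > 4 * t_av / eps."""
    if t <= 4.0 * t_av / eps:
        raise ValueError("Corollary 1.1 requires t > 4 * t_av / eps")
    exponent = -2.0 * (t * eps - 4.0 * t_av) ** 2 / ((t + 1.0) * (4.0 * t_av + 1.0) ** 2)
    return math.exp(exponent)

t_av = jacobi_t_av(b=0.5, sigma=1.0)          # close to pi^2 / 3 for these values
bound = occupation_bound(t=5_000.0, eps=0.2, t_av=t_av)
print(t_av, math.pi ** 2 / 3.0, bound)
```

Because all terms of the series are positive, a partial sum slightly underestimates $t_{av}$ and hence the bound; the neglected tail is of order $1/$`n_terms`, so a cautious user may want to add a tail estimate before relying on the number.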

In our second example, we introduce the finite interval analogue of the Ornstein-Uhlenbeck process first studied by Kessler and Sørensen (1999) on the state space $S=(-\pi/2,\pi/2)$ , where we take the drift to be $\mu(x)=-\rho\tan(x)$ , the diffusion coefficient to be $\sigma^{2}(x)=1$ and $\rho\geqslant 1/2$ to be a parameter. That is, the generator $\mathcal{A}^{O}$ is written as

$$\mathcal{A}^{O}=-\rho\tan(x)\frac{d}{dx}+\frac{1}{2}\frac{d^{2}}{dx^{2}}.\tag{1.3}$$

According to (Forman and Sørensen, 2008, Section $2.1$ ) $X$ is ergodic with $\sigma_{ess}(\mathcal{A}^{O})=\emptyset$ and

$$\lambda_{i}=i\left(\rho+\frac{i}{2}\right),\quad i\geqslant 1.\tag{1.4}$$

In view of Proposition 1.1 item (4), $X$ is thus uniformly ergodic for any $\rho\geqslant 1/2$. Specializing to the case $\rho=1/2$, we see that the stationary distribution has density given by

$$\pi(x)=\frac{\cos(x)}{2}\mathds{1}_{x\in(-\pi/2,\pi/2)}.$$

For $u\in\mathbb{R}$ , if we take $f(x)=e^{ux}\mathds{1}_{x\in(-\pi/2,\pi/2)}$ with $||f||\leqslant e^{u\pi/2}$ in Theorem 1.1, the time integral becomes

$$\int_{0}^{t}f(X_{s})\,ds=\int_{0}^{t}e^{uX_{s}}\,ds,$$

the exponential functional associated with $X$. Distributional information on exponential functionals is often difficult to obtain, see for instance the book of Yor (2001). One may approximate the time average of such a functional by its space average $\pi(f)$, and our results give a probabilistic error bound on this approximation. Theorem 1.1 now reads

Corollary 1.2.

Suppose that $X=\left(X_{t}\right)_{t\geqslant 0}$ is the finite interval analogue of the Ornstein-Uhlenbeck process which is uniformly ergodic with generator given by (1.3) and parameter $\rho=1/2$ . For any $\varepsilon>0$ , $u\in\mathbb{R}$ and $t>\frac{4e^{u\pi/2}t_{av}}{\varepsilon}$ , we have

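$$\mathbb{P}_{x}\left(\int_{0}^{t}e^{uX_{s}}\,ds-\frac{t\cosh(u\pi/2)}{1+u^{2}}\geqslant t\varepsilon\right)\leqslant\exp\left\{\frac{-2\left(t\varepsilon-4e^{u\pi/2}t_{av}\right)^{2}}{(t+1)e^{u\pi}\left(4t_{av}+1\right)^{2}}\right\},$$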

where

$$t_{av}=\sum_{i=1}^{\infty}\frac{2}{i(i+1)}=2$$

is the average hitting time of $X$ .
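The centering term in Corollary 1.2 is $t\,\pi(f)$ with $f(x)=e^{ux}$ and stationary density $\cos(x)/2$ on $(-\pi/2,\pi/2)$. As a cross-check, the space average $\pi(f)$ can be computed by quadrature and compared with the value $\cosh(u\pi/2)/(1+u^{2})$ that follows from integrating $e^{ux}\cos(x)$ in closed form. This is a sketch only; the function names, the midpoint rule and the test values of $u$ are illustrative assumptions.

```python
import math

def space_average(u, n=20_000):
    """Midpoint-rule value of pi(f) = int e^{u x} * cos(x) / 2 dx over (-pi/2, pi/2)."""
    a, b = -math.pi / 2.0, math.pi / 2.0
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += math.exp(u * x) * math.cos(x) / 2.0
    return total * h

def space_average_closed_form(u):
    """cosh(u * pi / 2) / (1 + u^2), from the antiderivative of e^{u x} cos(x)."""
    return math.cosh(u * math.pi / 2.0) / (1.0 + u ** 2)

for u in (0.0, 0.5, -1.0, 2.0):
    print(u, space_average(u), space_average_closed_form(u))
```

The case $u=0$ is a useful sanity check: then $f\equiv 1$ on the state space, so the space average must equal one.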

For further concrete examples of uniformly ergodic diffusions with explicit eigenvalues information, we refer interested readers to the work of Kessler and Sørensen (1999); Forman and Sørensen (2008).

The rest of the paper is organized as follows. In Section 2, we first present the proof of the main result Theorem 1.1, followed by detailing the proof of Corollary 1.1 and the proof of Corollary 1.2.

2. Proof of the main results

2.1. Proof of Theorem 1.1

Suppose without loss of generality that the mean of $f$ with respect to $\pi$ is zero, that is, $\pi(f)=0$ . To begin with, it follows readily from the induced operator norm of $Q^{\sharp}$ that we have

$$\left\|\hat{f}\right\|\leqslant\left\|f\right\|\left\|Q^{\sharp}\right\|,\tag{2.1}$$

where we recall $\hat{f}=Q^{\sharp}f$ is the solution to the Poisson equation. Now, for the large deviation probability, we see that

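\begin{align}
\mathbb{P}_{x}\left(\frac{1}{t}\int_{0}^{t}f(X_{s})\,ds-\pi(f)\geqslant\varepsilon\right)&\leqslant e^{-\theta t\varepsilon}\,\mathbb{E}_{x}\left[e^{\theta\int_{0}^{t}f(X_{s})\,ds}\right]\tag{2.2}\\
&=e^{-\theta t\varepsilon}\,\mathbb{E}_{x}\left[e^{-\theta\int_{0}^{t}\mathcal{A}\hat{f}(X_{s})\,ds}\right].\tag{2.3}
\end{align}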

In the above equation, (2.2) comes from Markov's inequality, which holds for any $\theta\geqslant 0$, while (2.3) follows from the Poisson equation $-\mathcal{A}\hat{f}=f$. Now, we explicitly construct a martingale that is useful in our analysis, namely

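$$\mathcal{M}_{t}^{\hat{f}}:=\hat{f}(X_{t})-\hat{f}(X_{0})-\int_{0}^{t}\mathcal{A}\hat{f}(X_{s})\,ds.\tag{2.4}$$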

Then by a classical result in (Bhattacharya and Waymire, 2009, Chapter $5$ Theorem $2.3$ ), we see that $\mathcal{M}_{t}^{\hat{f}}$ is a mean zero $\left\{\mathcal{F}_{t}\right\}$ -martingale, where we again recall $\mathcal{F}_{t}=\sigma\left\{X_{u}:0\leqslant u\leqslant t\right\}$ is the filtration of $X$ . Using (2.1) and (2.4), the tail bound in (2.3) is further upper bounded by

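$$e^{-\theta t\varepsilon}\,\mathbb{E}_{x}\left[e^{-\theta\int_{0}^{t}\mathcal{A}\hat{f}(X_{s})\,ds}\right]\leqslant e^{-\theta t\varepsilon}e^{2\theta\left\|Q^{\sharp}\right\|\left\|f\right\|}\,\mathbb{E}_{x}\left[e^{\theta\mathcal{M}_{t}^{\hat{f}}}\right].$$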

Now, we proceed to examine the bound for $\mathbb{E}_{x}\left[e^{\theta\mathcal{M}_{t}^{\hat{f}}}\right]$ . In order to use the classical Hoeffding’s lemma for bounded random variables (Devroye et al., 1996, Lemma $8.1$ ), we write $\mathcal{M}_{t}^{\hat{f}}$ as

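$$\mathcal{M}_{t}^{\hat{f}}=\sum_{s=1}^{\lfloor t\rfloor}\left(\mathcal{M}_{s}^{\hat{f}}-\mathcal{M}_{s-1}^{\hat{f}}\right)+\left(\mathcal{M}_{t}^{\hat{f}}-\mathcal{M}_{\lfloor t\rfloor}^{\hat{f}}\right).$$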

As a result, to bound the martingale $\mathcal{M}^{\hat{f}}_{t}$ it suffices to bound the martingale differences $\mathcal{M}^{\hat{f}}_{s}-\mathcal{M}^{\hat{f}}_{s-1}$ . Using the definition of $\mathcal{M}_{t}^{\hat{f}}$ in (2.4), these bounds are given by, for $s=1,2,...\left\lfloor t\right\rfloor$ ,

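\begin{align}
\left|\mathcal{M}_{s}^{\hat{f}}-\mathcal{M}_{s-1}^{\hat{f}}\right|&\leqslant\left|\hat{f}(X_{s})-\hat{f}(X_{s-1})\right|+\int_{s-1}^{s}\left|f(X_{u})\right|du\notag\\
&\leqslant 2\left\|Q^{\sharp}\right\|\left\|f\right\|+\left\|f\right\|\notag\\
&=\left(2\left\|Q^{\sharp}\right\|+1\right)\left\|f\right\|,\tag{2.6}
\end{align}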

where we use the Poisson equation in the first inequality and (2.1) in the second inequality. Similarly,

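$$\left|\mathcal{M}_{t}^{\hat{f}}-\mathcal{M}_{\lfloor t\rfloor}^{\hat{f}}\right|\leqslant\left(2\left\|Q^{\sharp}\right\|+1\right)\left\|f\right\|.\tag{2.7}$$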

It follows from double expectation, (2.5), (2.6) and (2.7) that the upper bound in (2.3) becomes

[TABLE]

where the first and second inequalities follow from repeated applications of Hoeffding’s lemma (Devroye et al., 1996, Lemma $8.1$). Finally, collecting the above results, the tail bound is given by

[TABLE]

which is minimized at $\theta=\theta^{*}$ where

[TABLE]

The desired result follows by substituting $\theta=\theta^{*}$ into (2.8).
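The final optimisation over $\theta$ can be sketched as follows, assuming the tail bound has an exponent that is quadratic in $\theta$, as Hoeffding's lemma typically produces. The shorthand $A$, $B$ and $g$ is introduced here for illustration only, and the constant $1/8$ is inferred from the statement of Theorem 1.1 rather than quoted from the displayed equations.

```latex
\begin{align*}
A &:= t\varepsilon-2\|f\|\|Q^{\sharp}\|, \qquad
B := (t+1)\|f\|^{2}\left(2\|Q^{\sharp}\|+1\right)^{2},\\
g(\theta) &:= -\theta A+\frac{\theta^{2}B}{8}, \qquad
g'(\theta^{*})=0 \iff \theta^{*}=\frac{4A}{B},\\
g(\theta^{*}) &= -\frac{4A^{2}}{B}+\frac{2A^{2}}{B} = -\frac{2A^{2}}{B}.
\end{align*}
```

Under the assumption $t>2\|f\|\|Q^{\sharp}\|/\varepsilon$ we have $A>0$, so $\theta^{*}>0$ is admissible in Markov's inequality, and $\exp\{g(\theta^{*})\}$ coincides with the right-hand side of Theorem 1.1.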

2.2. Proof of Corollary 1.1

The desired result follows from taking $f=\mathds{1}_{A}$ in Theorem 1.1 and using the following bound on the induced operator norm of the deviation kernel $Q^{\sharp}$:

$$\left\|Q^{\sharp}\right\|\leqslant 2t_{av},$$

see e.g. (Choi, 2018, Theorem $1.1$ ). As for the expression of the average hitting time $t_{av}$ , the eigentime identity Cheng and Mao (2015) gives

$$t_{av}=\sum_{i\geqslant 1}\frac{1}{\lambda_{i}},$$

where $0=\lambda_{0}<\lambda_{1}\leqslant\lambda_{2}\leqslant\ldots$ are the eigenvalues of $-\mathcal{A}^{J}$ which are given by, for $i=0,1,2,\ldots$ ,

$$\lambda_{i}=\frac{\sigma^{2}}{2}\,i\left(i-1+\frac{2b}{\sigma^{2}}\right),$$

see e.g. (Albanese and Kuznetsov, 2009, Appendix $B.3$ ).

2.3. Proof of Corollary 1.2

The desired result follows from taking $f(x)=e^{ux}\mathds{1}_{x\in(-\pi/2,\pi/2)}$ in Theorem 1.1, and using

[TABLE]

as well as the following bound on the induced operator norm of the deviation kernel $Q^{\sharp}$ :

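$$\left\|Q^{\sharp}\right\|\leqslant 2t_{av}=2\sum_{i\geqslant 1}\frac{1}{\lambda_{i}}=2\sum_{i=1}^{\infty}\frac{2}{i(i+1)}=4,$$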

where again the first equality follows from Cheng and Mao (2015) with $\lambda_{i}$ being given in (1.4) with parameter $\rho=1/2$ .
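For completeness, the two explicit quantities entering Corollary 1.2 can be worked out by hand. The computation below is a sketch that assumes only the stationary density $\cos(x)/2$ on $(-\pi/2,\pi/2)$ and the series for $t_{av}$ stated in the corollary; it uses the antiderivative $\int e^{ux}\cos(x)\,dx=e^{ux}\left(u\cos(x)+\sin(x)\right)/(1+u^{2})$ and a telescoping sum.

```latex
\begin{align*}
\pi(f) &= \int_{-\pi/2}^{\pi/2}e^{ux}\,\frac{\cos(x)}{2}\,dx
        = \frac{1}{2}\left[\frac{e^{ux}\left(u\cos(x)+\sin(x)\right)}{1+u^{2}}\right]_{-\pi/2}^{\pi/2}
        = \frac{e^{u\pi/2}+e^{-u\pi/2}}{2\left(1+u^{2}\right)}
        = \frac{\cosh(u\pi/2)}{1+u^{2}},\\
t_{av} &= \sum_{i=1}^{\infty}\frac{2}{i(i+1)}
        = 2\sum_{i=1}^{\infty}\left(\frac{1}{i}-\frac{1}{i+1}\right) = 2.
\end{align*}
```

At $u=0$ the first line gives $\pi(f)=1$, as it must for $f\equiv 1$ on the state space.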

Acknowledgement

We thank the anonymous referee for constructive comments that improved the presentation of the manuscript. This work is partially supported by the Chinese University of Hong Kong, Shenzhen grant PF01001143.

Bibliography

1. Albanese and Kuznetsov (2009). C. Albanese and A. Kuznetsov. Transformations of Markov processes and classification scheme for solvable driftless diffusions. Markov Process. Related Fields, 15(4):563–574, 2009.
2. Bentkus (2008). V. Bentkus. An extension of the Hoeffding inequality to unbounded random variables. Lith. Math. J., 48(2):137–157, 2008.
3. Bhattacharya and Waymire (2009). R. N. Bhattacharya and E. C. Waymire. Stochastic processes with applications, volume 61. SIAM, 2009.
4. Boucher (2009). T. R. Boucher. A Hoeffding inequality for Markov chains using a generalized inverse. Statist. Probab. Lett., 79(8):1105–1107, 2009.
5. Cheng and Mao (2015). L.-J. Cheng and Y.-H. Mao. Eigentime identity for one-dimensional diffusion processes. J. Appl. Probab., 52(1):224–237, 2015.
6. Choi (2018). M. C. Choi. A scale function approach for Stein’s method of one-dimensional diffusion. Submitted, 2018.
7. Devroye et al. (1996). L. Devroye, L. Györfi, and G. Lugosi. A probabilistic theory of pattern recognition, volume 31 of Applications of Mathematics (New York). Springer-Verlag, New York, 1996.
8. Forman and Sørensen (2008). J. L. Forman and M. Sørensen. The Pearson diffusions: a class of statistically tractable diffusion processes. Scand. J. Statist., 35(3):438–465, 2008.
