A Bernstein Inequality For Exponentially Growing Graphs

Johannes T. N. Krebs

arXiv:1701.04188·math.ST·September 20, 2017


TL;DR

This paper introduces a Bernstein inequality tailored for sums of random variables on exponentially growing graphs, enabling better concentration bounds in highly-connected network structures.

Contribution

It provides a novel Bernstein inequality applicable to graphs with exponential node growth, aiding in statistical analysis of complex networks.

Findings

01. Derived a Bernstein inequality for exponential graphs

02. Facilitates concentration inequalities in highly-connected networks

03. Supports consistency analysis of nonparametric estimators

Abstract

In this article we present a Bernstein inequality for sums of random variables which are defined on a graphical network whose nodes grow at an exponential rate. The inequality can be used to derive concentration inequalities in highly-connected networks. It can be useful to obtain consistency properties for nonparametric estimators of conditional expectation functions which are derived from such networks.

Equations

$$d_{G}:V\times V\to\mathbbm{N},\qquad(v,w)\mapsto\inf\big\{l\in\mathbbm{N}\text{ such that there are }(v_{0},v_{1}),\ldots,(v_{l-1},v_{l})\in E\text{ with }v_{0}=v,\,v_{l}=w\big\}.$$

$$\alpha_{G}(n)\coloneqq\sup_{\substack{I,J\subseteq V,\\ d_{G}(I,J)\geq n}}\ \sup_{\substack{A\in\mathcal{F}(I),\\ B\in\mathcal{F}(J)}}\big|\mathbbm{P}(A\cap B)-\mathbbm{P}(A)\mathbbm{P}(B)\big|,\qquad n\in\mathbbm{N}.$$

$$V=\{(j,k):j\in\mathbbm{N},\,1\leq k\leq A^{j}\}\quad\text{and}\quad E=\{((j,k),(j+1,k^{\prime})):(j,k)\in V\text{ and }A(k-1)+1\leq k^{\prime}\leq Ak\}.$$

$$\sup\{d_{T}(v,w)\,|\,(v,w)\in\tilde{E}\}<\infty.$$

$$\sup\{d_{G}(v_{i},w_{i}):i\in\mathbbm{N}\}<\infty\ \Longrightarrow\ \sup\{d_{\infty,\mathbbm{Z}^{N}}(v^{\prime}_{i},w^{\prime}_{i}):i\in\mathbbm{N}\}<\infty.$$

$$\sup\{d_{\infty,\mathbbm{Z}^{N}}(v^{\prime},w^{\prime}):v\in V,\,w\in\mathcal{N}(v),\,v\text{ (resp. }w\text{) is isomorphic to }v^{\prime}\text{ (resp. }w^{\prime}\text{)}\}<\infty.$$

$$C\coloneqq\sup\{d_{\infty,\mathbbm{Z}^{N}}(v,w):d_{G}(v,w)=1,\,v,w\in V\}$$

$$\{(I^{\prime},J^{\prime}):I^{\prime},J^{\prime}\subseteq V^{\prime}\text{ and }d_{\infty,\mathbbm{Z}^{N}}(I^{\prime},J^{\prime})\geq n\}\subseteq\{(I,J):I,J\subseteq V\text{ and }d_{G}(I,J)\geq C^{-1}n\}.$$

$$L_{k}\coloneqq\Big\{v\in V\setminus\bigcup_{i=0}^{k-1}L_{i}\ \Big|\ \exists\,w\in L_{k-1}\text{ with }w\in\mathcal{N}(v)\Big\}$$

$$2k<d_{\infty,\mathbbm{Z}^{N}}(v,w)\leq C\,d_{G}(v,w)\leq 2Ck.$$

$$2C^{2}k<d_{\infty,\mathbbm{Z}^{N}}(v,w)\leq C\,d_{G}(v,w)\leq 2Ck;$$

$$V(j,k,P)\coloneqq\{(j^{\prime},k^{\prime})\in V\,|\,j\leq j^{\prime}\leq j+P-1,\ A^{j^{\prime}-j}(k-1)+1\leq k^{\prime}\leq A^{j^{\prime}-j}k\}$$

$$\begin{aligned}
N(P,L)&=2\,\frac{A^{P}-A^{L}}{A-1}\,\mathbbm{1}\{L\leq P-1\}+(L-1)\big(A^{P}-A^{L-1}\big)\,\mathbbm{1}\{L\leq P\}\\
&\quad+\big(2(P-1)-L+1\big)\big(A^{P-1+\lfloor L/2\rfloor}-A^{(L-1)\vee P}\big)\,\mathbbm{1}\{L\geq 4\}\\
&\quad-2\,\frac{\big\{(A-1)(P-1-\lceil L/2\rceil)-1\big\}A^{P-1+\lfloor L/2\rfloor}+A^{L}}{A-1}\,\mathbbm{1}\{L\geq 4\}\\
&\quad+2\,\frac{\big\{(A-1)(P-L)-1\big\}A^{P}+A^{L}}{A-1}\,\mathbbm{1}\{4\leq L<P\}\\
&\leq C\,P\,A^{P+L/2},
\end{aligned}$$

$$V^{\prime}\coloneqq V(L,1,P)\cup\dots\cup V(L,A^{L},P)$$

$$\mathbbm{P}\Big(\sum_{v\in V^{\prime}}Z_{v}>\varepsilon\Big)\leq 2e^{-\beta\varepsilon}\exp\Big\{10e\,\alpha_{T}(f)^{(P_{2}+Q_{2})/(2P_{2}+2Q_{2}+A^{L})}\,\frac{A^{L}}{P_{2}+Q_{2}}\Big\}\cdot\exp\Big\{4\beta^{2}e\,(P_{2})^{2}\Big(\frac{A^{P}-1}{A-1}\sigma^{2}+4C^{2}\sum_{k=1}^{2(P-1)}\alpha_{T}(k)N(P,k)\Big)\Big(\frac{A^{L}}{P_{2}+Q_{2}}+1\Big)\Big\}$$

$$\beta\leq\frac{A-1}{4eCP_{2}(A^{P}-1)}\quad\text{and}\quad f\coloneqq 2\left\lceil\frac{\log Q_{2}}{\log A}\right\rceil.$$

$$A(i)$$

$$B(i)$$

$$V^{\prime}_{1}\coloneqq\bigcup_{i=1}^{T}A(i)\quad\text{and}\quad V^{\prime}_{2}\coloneqq\bigcup_{i=1}^{T}B(i).$$

$$\mathbbm{P}\Big(\sum_{v\in V^{\prime}}Z_{v}>\varepsilon\Big)\leq\frac{e^{-\beta\varepsilon}}{2}\Big\{\mathbbm{E}\Big[e^{2\beta\sum_{v\in V^{\prime}_{1}}Z_{v}}\Big]+\mathbbm{E}\Big[e^{2\beta\sum_{v\in V^{\prime}_{2}}Z_{v}}\Big]\Big\}\quad\text{for }\beta>0.$$

$$S(i)\coloneqq\sum_{v\in\bigcup_{k=1}^{i}A(k)}Z_{v}\quad\text{and}\quad J(i)\coloneqq\sum_{v\in A(i)}Z_{v}\quad\text{for }i=1,\dots,T.$$

$$\mathbbm{E}\big[e^{\delta S(i)}\big]\leq 10\,\alpha_{T}(f)^{1/a}\,\lVert\exp(\delta S(i-1))\rVert_{b}\,\lVert\exp(\delta J(i))\rVert_{\infty}+\mathbbm{E}\big[e^{\delta S(i-1)}\big]\,\mathbbm{E}\big[e^{\delta J(i)}\big]$$

$$\exp(\delta J(i))\leq 1+\delta J(i)+\delta^{2}J(i)^{2}.$$

$$|\delta J(i)|\leq\delta\,C\,P_{2}\,\frac{A^{P}-1}{A-1}\leq\frac{1}{2e},\quad\text{hence,}\quad\mathbbm{E}[\exp(\delta J(i))]\leq\exp\big(\delta^{2}\,\mathbbm{E}[J(i)^{2}]\big).$$

$$\mathbbm{E}[J(i)^{2}]\leq(P_{2})^{2}\Big\{\frac{A^{P}-1}{A-1}\sigma^{2}+4C^{2}\sum_{k=1}^{2(P-1)}\alpha_{T}(k)N(P,k)\Big\}=:K$$

$$\mathbbm{E}[\exp(\delta S(i))]\leq\Big\{10\,\alpha_{T}(f)^{1/a}\,\lVert\exp(\delta J(i))\rVert_{\infty}+\exp(\delta^{2}K)\Big\}\,\lVert\exp(\delta S(i-1))\rVert_{b}.$$

$$\mathbbm{E}[\exp(\delta S(T))]\leq\exp\Big\{10e\,\alpha_{T}(f)^{1/(T+1)}\,(T-1)+\delta^{2}eKT\Big\}.$$


Full text

A Bernstein Inequality For Exponentially Growing Graphs¹

Johannes T. N. Krebs²

¹ This research was supported by the Fraunhofer ITWM, 67663 Kaiserslautern, Germany which is part of the Fraunhofer Gesellschaft zur Förderung der angewandten Forschung e.V. The author thanks Hannes Christiansen for proofreading parts of the article.

² Department of Mathematics, University of Kaiserslautern, 67653 Kaiserslautern, Germany, email: [email protected]


Keywords: Asymptotic inference; asymptotic inequalities; Bernstein inequality; Concentration inequality; Graphs; Highly-connected graphical networks; Mixing; Nonparametric statistics; Random fields; Stochastic processes

MSC 2010: Primary: 62G20, 62M40, 90B15; Secondary: 62G07, 62G08, 91D30

Inequalities of the Bernstein type are an important tool for asymptotic analysis in probability theory and statistics. The original inequality derived by Bernstein (1927) gives bounds on $\mathbbm{P}(|S_{n}|>\varepsilon)$, where $S_{n}=\sum_{k=1}^{n}Z_{k}$ for bounded random variables $Z_{1},\ldots,Z_{n}$ which are i.i.d. and have expectation zero. There are various versions of Bernstein’s inequality, e.g., Hoeffding (1963). In particular, generalizations to different kinds of stochastic processes have gained importance: Carbon (1983), Collomb (1984), Bryc and Dembo (1996) and Merlevède et al. (2009) provide extensions to time series $\{Z_{t}:t\in\mathbbm{Z}\}$ which are weakly dependent. Valenzuela-Domínguez et al. (2017) give a further generalization to strong mixing random fields $\{Z_{s}:s\in\mathbbm{Z}^{N}\}$ which are defined on the regular lattice $\mathbbm{Z}^{N}$ for some lattice dimension $N\in\mathbbm{N}_{+}$. The corresponding definitions of dependence are given in Doukhan (1994) and in Bradley (2005).

Bernstein inequalities in particular find their applications when deriving large deviation results in nonparametric regression and density estimation, compare Györfi et al. (1989) and Györfi et al. (2002).

In this article we derive a new Bernstein inequality which adapts to highly-connected networks where the number of nodes grows at an exponential rate. A well-known example of such a graph is the internet map, which tries to represent the internet with visual graphics. Another application may be nested simulations, which are used in insurance mathematics to simulate the outcome of an insurance contract. Based on this new Bernstein inequality, we derive a concentration inequality which ensures that in simulations the nonparametric regression or density function estimator is consistent. It turns out that we need a somewhat stricter decay in the $\alpha$-mixing coefficients than is usually assumed for time series. Due to the special geometric structure of the underlying data, many technical aspects in the proofs of these new inequalities are much more involved than is the case for time series data or for data which is defined on a lattice.

This paper is organized as follows: we give the motivation and the definitions in Section 1. Section 2 contains the new Bernstein inequality and concentration inequalities for exponentially growing graphs; it is the main part of this article. Appendix A contains a useful result of Davydov (1968).

1 Introduction

In this section we consider a general graph $G=(V,E)$ with a countable set of nodes $V$ and a set of edges $E$ . We define the natural metric on $G$ as the minimal number of edges between two nodes

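$$d_{G}:V\times V\to\mathbbm{N},\qquad(v,w)\mapsto\inf\big\{l\in\mathbbm{N}\text{ such that there are }(v_{0},v_{1}),\ldots,(v_{l-1},v_{l})\in E\text{ with }v_{0}=v,\,v_{l}=w\big\}.$$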

The metric $d_{G}$ is extended to sets $I,J\subseteq V$ in the usual way: $d_{G}(I,J)=\inf\{d_{G}(v,w):v\in I,w\in J\}$ . We denote by $\mathcal{N}(v)$ the set of neighbors of $v$ w.r.t. $G$ for a node $v$ of a graph $G=(V,E)$ . Furthermore, we assume that there is a probability space $(\Omega,\mathcal{A},\mathbbm{P})$ which is endowed with a real-valued random field $Z$ . The latter is indexed by the set of nodes $V$ , i.e., $Z$ is a family of random variables $\{Z_{v}:v\in V\}$ such that $Z_{v}:\Omega\rightarrow S$ is measurable for each $v\in V$ . We denote the indicator function by $\mathbbm{1}$ and we define the $\alpha$ -mixing coefficient of the random field $\{Z_{v}:v\in V\}$ on the graph $G=(V,E)$ by

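$$\alpha_{G}(n)\coloneqq\sup_{\substack{I,J\subseteq V,\\ d_{G}(I,J)\geq n}}\ \sup_{\substack{A\in\mathcal{F}(I),\\ B\in\mathcal{F}(J)}}\big|\mathbbm{P}(A\cap B)-\mathbbm{P}(A)\mathbbm{P}(B)\big|,\qquad n\in\mathbbm{N}.$$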
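For intuition, the inner supremum in the definition of $\alpha_{G}(n)$ can be evaluated by brute force when only two finite-valued random variables are involved. The following is a minimal sketch under the assumption that $\mathcal{F}(I)$ denotes the $\sigma$-algebra generated by $\{Z_{v}:v\in I\}$, with $I=\{v\}$ and $J=\{w\}$; the function names `powerset` and `alpha_coefficient` are illustrative and do not come from the paper.

```python
from itertools import chain, combinations


def powerset(values):
    """All subsets of a finite collection, as tuples."""
    items = list(values)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def alpha_coefficient(joint):
    """alpha-mixing coefficient between sigma(X) and sigma(Y).

    `joint` maps pairs (x, y) to P(X = x, Y = y). Since X and Y take finitely
    many values, every event of sigma(X) has the form {X in S} for a subset S.
    """
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    best = 0.0
    for sx in powerset(xs):
        for sy in powerset(ys):
            p_ab = sum(p for (x, y), p in joint.items() if x in sx and y in sy)
            p_a = sum(p for (x, _), p in joint.items() if x in sx)
            p_b = sum(p for (_, y), p in joint.items() if y in sy)
            best = max(best, abs(p_ab - p_a * p_b))
    return best


# Two fair coins: independent, and perfectly dependent.
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
identical = {(0, 0): 0.5, (1, 1): 0.5}
```

Independent variables give a coefficient of zero, while the perfectly dependent pair attains $1/4$, the largest value an $\alpha$-mixing coefficient can take.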

The random field is strong mixing w.r.t. $G$ if and only if $\alpha_{G}(n)\rightarrow 0$ for $n\rightarrow\infty$ . In the sequel, we investigate random fields which are defined on the following class of graphs:

Definition 1.1 (Trees growing at an exponential rate $A$ ).

Let $A\in\mathbbm{N}_{+}$. A tree $T=(V,E)$ is growing at an exponential rate $A$ if $T$ is a rooted tree and each node $v\in V$ has exactly $A$ children. The nodes in the tree are labeled according to the following scheme: the distinguished root (which has no parent) is labeled by $(0,1)$ and the children of the node $(j,k)$ are labeled by $(j+1,A(k-1)+1),\ldots,(j+1,Ak)$. Hence, the set of nodes and the set of edges are given by

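$$V=\{(j,k):j\in\mathbbm{N},\,1\leq k\leq A^{j}\}\quad\text{and}\quad E=\{((j,k),(j+1,k^{\prime})):(j,k)\in V\text{ and }A(k-1)+1\leq k^{\prime}\leq Ak\}.$$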

A rooted graph $G=(V,E)$ is growing at an exponential rate $A$ if the edges $E$ can be decomposed into two disjoint sets as $E=E^{\prime}\cup\tilde{E}$ such that $T\coloneqq(V,E^{\prime})$ is a tree growing at an exponential rate $A$ and the set $\tilde{E}$ of additional edges does not connect nodes which are arbitrarily far apart in $T$, i.e.,

$$\sup\{d_{T}(v,w)\,|\,(v,w)\in\tilde{E}\}<\infty.$$
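The labeling scheme of Definition 1.1 can be made concrete with a short sketch that builds the first generations of the tree and computes the graph metric by breadth-first search. It uses the node set $V=\{(j,k):1\leq k\leq A^{j}\}$, so the root is the single node of generation $j=0$; the helper names `build_tree` and `graph_distances` are hypothetical and only serve the illustration.

```python
from collections import deque


def build_tree(A, generations):
    """Nodes and child lists for generations j = 0, ..., generations - 1."""
    nodes = [(j, k) for j in range(generations) for k in range(1, A**j + 1)]
    children = {
        (j, k): [(j + 1, kk) for kk in range(A * (k - 1) + 1, A * k + 1)]
        for (j, k) in nodes
        if j + 1 < generations
    }
    return nodes, children


def graph_distances(nodes, children, source):
    """Breadth-first search: number of edges from `source` to every node."""
    neighbors = {v: [] for v in nodes}
    for parent, kids in children.items():
        for kid in kids:
            # Edges are traversed in both directions when measuring distance.
            neighbors[parent].append(kid)
            neighbors[kid].append(parent)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


nodes, children = build_tree(A=2, generations=4)
dist_from_root = graph_distances(nodes, children, source=(0, 1))
dist_from_leaf = graph_distances(nodes, children, source=(3, 1))
```

With $A=2$ and four generations the tree has 15 nodes, and two leaves in different halves of the tree are $2\cdot 3=6$ edges apart, which is the largest distance in this subtree.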

We come to the definition of a mixing embedding of a graph. It is worth mentioning that, especially in the context of graph theory, there are different definitions of graph embeddings: the common definition of an embedding of a graph $G$ requires, loosely speaking, that the edges of the embedded graph may only intersect at their endpoints, i.e., at the nodes. It is well known that any graph with countably many nodes can be embedded into $\mathbbm{Z}^{3}$ by placing the $i$-th node at the point $(i,i^{2},i^{3})\in\mathbbm{Z}^{3}$, compare Cohen et al. (1994). Furthermore, one can characterize the finite graphs which are embeddable into the plane (the planar graphs) with the help of the theorems of Kuratowski (1930) and of Wagner (1937). Here, we slightly change this graph theoretic definition such that it is tailored to our needs: we can omit the restriction that edges may not intersect at an interior point. However, since we shall usually be dealing with infinite graphs, we have to add a requirement that is essential when it comes to mixing random fields which are defined on the graph to be embedded. We need this definition to show what is intuitively clear: the Bernstein inequalities for regular lattices are not applicable in the context of graphs which grow at an exponential rate. We give the definition:

Definition 1.2 (Mixing embedding of a graph).

Let $G=(V,E)$ be a graph with countably many nodes $V$ and denote by $d_{p,\mathbbm{Z}^{N}}$ the metric induced by the $p$-norm on the $N$-dimensional lattice $\mathbbm{Z}^{N}$, for $p\geq 1$ and $N\in\mathbbm{N}_{+}$. There is a mixing embedding of $G$ in $\mathbbm{Z}^{N}$ if $G$ is isomorphic to a graph $G^{\prime}=(V^{\prime},E^{\prime})$ with $V^{\prime}\subseteq\mathbbm{Z}^{N}$ and for each sequence $((v_{i},w_{i}):i\in\mathbbm{N})\subseteq V\times V$ with image $((v^{\prime}_{i},w^{\prime}_{i}):i\in\mathbbm{N})\subseteq V^{\prime}\times V^{\prime}$ it is true that

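$$\sup\{d_{G}(v_{i},w_{i}):i\in\mathbbm{N}\}<\infty\ \Longrightarrow\ \sup\{d_{\infty,\mathbbm{Z}^{N}}(v^{\prime}_{i},w^{\prime}_{i}):i\in\mathbbm{N}\}<\infty.$$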

In the following, when speaking of the lattice $\mathbbm{Z}^{N}$ as a graph, we shall always understand the graph $G=(V,E)$ with nodes $V=\mathbbm{Z}^{N}$ and edges $E=\{(v,v+b_{i}):v\in V,\;i=1,\ldots,N\}$ where $b_{i}$ is the $i$ -th standard basis vector which is one in the $i$ -th coordinate and zero otherwise. Note that in this case, we have $d_{G}\equiv d_{1,\mathbbm{Z}^{N}}$ and $d_{\infty,\mathbbm{Z}^{N}}\leq d_{1,\mathbbm{Z}^{N}}\leq Nd_{\infty,\mathbbm{Z}^{N}}$ . We have a practical lemma which gives equivalent formulations of this definition

Lemma 1.3.

Let $G$ be a graph. Then the following are equivalent:

1. There is a mixing embedding of $G$ in $\mathbbm{Z}^{N}$.

2. $G$ is isomorphic to a graph $G^{\prime}=(V^{\prime},E^{\prime})$ with nodes $V^{\prime}\subseteq\mathbbm{Z}^{N}$ and there is a constant $0<C<\infty$ such that for any $(v,w)\in V\times V$ with image $(v^{\prime},w^{\prime})\in V^{\prime}\times V^{\prime}$ it is true that $d_{\infty,\mathbbm{Z}^{N}}(v^{\prime},w^{\prime})\leq C\,d_{G}(v,w)$.

3. $G$ is isomorphic to a graph $G^{\prime}=(V^{\prime},E^{\prime})$ with nodes $V^{\prime}\subseteq\mathbbm{Z}^{N}$ and

$$\sup\{d_{\infty,\mathbbm{Z}^{N}}(v^{\prime},w^{\prime}):v\in V,\,w\in\mathcal{N}(v),\,v\text{ (resp. }w\text{) is isomorphic to }v^{\prime}\text{ (resp. }w^{\prime}\text{)}\}<\infty.$$

In particular, let $\{Z_{v}:v\in V\}$ be a random field on $G$ , denote by $\{Z_{s}:s\in V^{\prime}\}$ the same random field under the graph isomorphism. Then the mixing coefficients satisfy asymptotically $\alpha_{\infty,\mathbbm{Z}^{N}}(\left\lceil C\,\cdot\,\right\rceil)\leq\alpha_{G}$ which means that strong mixing is inherited when switching between $G$ and $G^{\prime}$ .

Proof.

(1) $\Rightarrow$ (2) and (3): assume that there is a mixing embedding of $G$ in $\mathbbm{Z}^{N}$ , then obviously $V$ is countable, thus, the number

$$C\coloneqq\sup\{d_{\infty,\mathbbm{Z}^{N}}(v,w):d_{G}(v,w)=1,\,v,w\in V\}$$

is meaningful and finite by assumption. Consequently, applying the triangle inequality along a shortest path, we have for two connected nodes $v$ and $w$ in $V$ that $d_{\infty,\mathbbm{Z}^{N}}(v,w)\leq C\,d_{G}(v,w)$. If $v$ and $w$ are not connected then $d_{G}(v,w)=\infty$. Hence, $C$ is the proper constant. The converse implications (2) (resp. (3)) $\Rightarrow$ (1) are immediate.

We come to the supplementary statement of the lemma. Let $n\in\mathbbm{N}$ be given and consider a random field on $G$ and its graph-isomorphic counterpart on $G^{\prime}$. We infer for two sets $I^{\prime},J^{\prime}\subseteq V^{\prime}$ with preimage $I,J\subseteq V$ and $d_{\infty,\mathbbm{Z}^{N}}(I^{\prime},J^{\prime})\geq n$ that $C\,d_{G}(I,J)\geq d_{\infty,\mathbbm{Z}^{N}}(I^{\prime},J^{\prime})\geq n$, i.e., we have using the graph isomorphism for $n\geq C$

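$$\{(I^{\prime},J^{\prime}):I^{\prime},J^{\prime}\subseteq V^{\prime}\text{ and }d_{\infty,\mathbbm{Z}^{N}}(I^{\prime},J^{\prime})\geq n\}\subseteq\{(I,J):I,J\subseteq V\text{ and }d_{G}(I,J)\geq C^{-1}n\}.$$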

Thus, $\alpha_{\infty,\mathbbm{Z}^{N}}(n)\leq\alpha_{G}\left(\left\lfloor C^{-1}\,n\right\rfloor\right)$ for $n\geq C$ . This means that asymptotically $\alpha_{\infty,\mathbbm{Z}^{N}}\leq\alpha_{G}\left(\left\lfloor C^{-1}\,\cdot\,\right\rfloor\right)$ or rather $\alpha_{\infty,\mathbbm{Z}^{N}}\left(\left\lceil C\,\cdot\,\right\rceil\right)\leq\alpha_{G}$ . ∎

The following class of graphs does not allow for a mixing embedding in $\mathbbm{Z}^{N}$

Proposition 1.4.

Let $G=(V,E)$ be a graph with root $v_{0}\in V$ . Put $L_{0}\coloneqq\{v_{0}\}$ and recursively

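$$L_{k}\coloneqq\Big\{v\in V\setminus\bigcup_{i=0}^{k-1}L_{i}\ \Big|\ \exists\,w\in L_{k-1}\text{ with }w\in\mathcal{N}(v)\Big\}$$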

for $k\in\mathbbm{N}_{+}$ . If the map $\mathbbm{N}\ni k\mapsto|L_{k}|$ grows faster than any polynomial function of degree $N$ defined on $\mathbbm{N}$ , there is no mixing embedding of $G$ in $\mathbbm{Z}^{N}$ .

Proof.

Let the map $\mathbbm{N}\ni k\mapsto|L_{k}|$ grow faster than any polynomial of degree $N$ and assume that there is a mixing embedding of $G$ in $\mathbbm{Z}^{N}$ with a constant $0<C<\infty$ which satisfies $d_{\infty,\mathbbm{Z}^{N}}(v,w)\leq C\,d_{G}(v,w)$ as stated in Lemma 1.3. First, observe that for $v$ and $w$ both in $L_{k}$ the distance in the graph satisfies $d_{G}(v,w)\leq 2k$. By assumption there is a $k_{1}\in\mathbbm{N}$ such that for all $k\geq k_{1}$ we have $|L_{k}|>(2k+1)^{N}$. A subset of $\mathbbm{Z}^{N}$ whose points are pairwise at most $2k$ apart w.r.t. $d_{\infty,\mathbbm{Z}^{N}}$ contains at most $(2k+1)^{N}$ points. Thus, for $k\geq k_{1}$ there are $v,w\in L_{k}$ with the property that $d_{\infty,\mathbbm{Z}^{N}}(v,w)>2k$, which implies for these two nodes that

$$2k<d_{\infty,\mathbbm{Z}^{N}}(v,w)\leq C\,d_{G}(v,w)\leq 2Ck.$$

Hence, $C>1$. In the same way, there is a $k_{2}\in\mathbbm{N}$ such that for all $k\geq k_{2}$, we have $|L_{k}|>(2C^{2}k+1)^{N}$. In particular, there are $v,w\in L_{k}$, $k\geq k_{2}$, with the property that $d_{\infty,\mathbbm{Z}^{N}}(v,w)>2C^{2}k$, which implies for $k\geq\max(k_{1},k_{2})$

$$2C^{2}k<d_{\infty,\mathbbm{Z}^{N}}(v,w)\leq C\,d_{G}(v,w)\leq 2Ck;$$

which in turn implies $C<1$ . This contradicts the assumption that there is a mixing embedding of $G$ in $\mathbbm{Z}^{N}$ . ∎
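The counting argument in this proof can be illustrated numerically for a tree growing at rate $A$, where generation $k$ contains $A^{k}$ nodes. The sketch below searches for the first generation whose size exceeds the number $(2Ck+1)^{N}$ of lattice points that fit into a sup-norm cube of side length $2Ck$. The function name `first_overflow_generation` and the search limit `k_max` are assumptions made for this illustration, and $C$ is taken to be an integer for simplicity.

```python
def first_overflow_generation(A, N, C=1, k_max=10_000):
    """Smallest k >= 1 with A**k > (2*C*k + 1)**N, or None if none is found up to k_max."""
    for k in range(1, k_max + 1):
        # Left: nodes in generation k. Right: lattice points in a cube of side 2*C*k.
        if A**k > (2 * C * k + 1) ** N:
            return k
    return None
```

For a binary tree the threshold is generation 3 in dimension $N=1$ and generation 9 in dimension $N=2$ (with $C=1$), whereas for $A=1$ no such generation exists.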

This implies that we cannot use the above mentioned Bernstein inequalities for data which is defined on a lattice to derive concentration inequalities for random fields that are defined on graphs which grow at an exponential rate $A$ . Instead we give a new Bernstein inequality which can deal with this class of random fields in the next section.

2 A Bernstein inequality for exponentially growing graphs

In this section we derive inequalities of the Bernstein type for random fields which are highly-connected and whose index set grows at an exponential rate. We need the following important lemma:

Lemma 2.1.

Let $T=(V,E)$ be a tree growing at an exponential rate $A$ . Denote by

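$$V(j,k,P)\coloneqq\{(j^{\prime},k^{\prime})\in V\,|\,j\leq j^{\prime}\leq j+P-1,\ A^{j^{\prime}-j}(k-1)+1\leq k^{\prime}\leq A^{j^{\prime}-j}k\}\tag{2.1}$$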

the set of nodes of the subtree of $T$ which has its root at the node $(j,k)$ and consists of $P\in\mathbbm{N}_{+}$ generations. Consider the graph which is induced by the set of nodes $V(j,k,P)$ . Then the number of pairs $(v,w)$ in this graph which are separated by exactly $L$ edges for $1\leq L\leq 2(P-1)$ is given by

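$$\begin{aligned}
N(P,L)&=2\,\frac{A^{P}-A^{L}}{A-1}\,\mathbbm{1}\{L\leq P-1\}+(L-1)\big(A^{P}-A^{L-1}\big)\,\mathbbm{1}\{L\leq P\}\\
&\quad+\big(2(P-1)-L+1\big)\big(A^{P-1+\lfloor L/2\rfloor}-A^{(L-1)\vee P}\big)\,\mathbbm{1}\{L\geq 4\}\\
&\quad-2\,\frac{\big\{(A-1)(P-1-\lceil L/2\rceil)-1\big\}A^{P-1+\lfloor L/2\rfloor}+A^{L}}{A-1}\,\mathbbm{1}\{L\geq 4\}\\
&\quad+2\,\frac{\big\{(A-1)(P-L)-1\big\}A^{P}+A^{L}}{A-1}\,\mathbbm{1}\{4\leq L<P\}\\
&\leq C\,P\,A^{P+L/2},
\end{aligned}\tag{2.2}$$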

for a suitable constant $0<C<\infty$ which does not depend on $P$ , $L$ and $A$ .

Proof of Lemma 2.1.

The minimal distance in this subtree clearly is 1, whereas the maximal distance is $2(P-1)$. Let now a length $L$ be fixed, $1\leq L\leq 2(P-1)$. We distinguish two cases for a pair $(v,w)$ which is separated by $L$ edges: in the first case $w$ (resp. $v$) is a descendant of $v$ (resp. $w$). In the second case $v$ and $w$ have a closest common ancestor which we call $r$ and, plainly, $v\neq r\neq w$.

The first case is only possible for $1\leq L\leq P-1$; for such an $L$ there are exactly $2(A^{P}-A^{L})/(A-1)$ such pairs $(v,w)$ in this subtree. The second case is possible for $2\leq L\leq 2(P-1)$. Depending on $L$ the common ancestor is located between generation zero and generation $P-1-\left\lceil L/2\right\rceil$ of the subtree; denote its generation by $h$. Having fixed a common ancestor $r$ in generation $h$, the distance from $r$ to the first node $v$ is at least $1\vee(L-(P-1-h))$ and at most $L\wedge(P-1-h)$; denote this distance by $i$. Hence, there are exactly $A^{i}$ nodes in question for $v$. In the case that $i<L$ the node $w$ is separated by $L-i$ generations from $r$. Since $v\neq w$ and their graph distance is $L$, this yields $(A-1)A^{L-i-1}$ possibilities for $w$. All in all, the number of pairs is given by the formula in (2.2). ∎
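The counts in this proof can be checked by brute force on small trees. The sketch below enumerates all ordered pairs of the subtree $V(0,1,P)$, groups them by distance, and compares the result with the closed form for $N(P,L)$, reading the exponent in the third term as $(L-1)\vee P$ (an interpretation of the extracted formula, not something the source states explicitly). The function names are hypothetical.

```python
from itertools import permutations


def parent(node, A):
    """Parent of (j, k): children of (j, k) carry labels A(k-1)+1, ..., Ak."""
    j, k = node
    return (j - 1, (k - 1) // A + 1)


def tree_distance(v, w, A):
    """Number of edges between v and w, found by climbing to the closest common ancestor."""
    steps = 0
    while v != w:
        # Move the deeper node up (either one if both are in the same generation).
        if v[0] >= w[0]:
            v = parent(v, A)
        else:
            w = parent(w, A)
        steps += 1
    return steps


def count_pairs(A, P):
    """Ordered pairs (v, w), v != w, of V(0, 1, P), grouped by distance."""
    nodes = [(j, k) for j in range(P) for k in range(1, A**j + 1)]
    total, descendant = {}, {}
    for v, w in permutations(nodes, 2):
        d = tree_distance(v, w, A)
        total[d] = total.get(d, 0) + 1
        # First case of the proof: one node is an ancestor of the other,
        # which happens exactly when the distance equals the generation gap.
        if d == abs(v[0] - w[0]):
            descendant[d] = descendant.get(d, 0) + 1
    return total, descendant


def closed_form(A, P, L):
    """Closed form for N(P, L) as reconstructed here; requires A >= 2."""
    n = 0
    if L <= P - 1:
        n += 2 * (A**P - A**L) // (A - 1)
    if L <= P:
        n += (L - 1) * (A**P - A ** (L - 1))
    if L >= 4:
        half_down, half_up = L // 2, (L + 1) // 2
        top = A ** (P - 1 + half_down)
        n += (2 * (P - 1) - L + 1) * (top - A ** max(L - 1, P))
        n -= 2 * (((A - 1) * (P - 1 - half_up) - 1) * top + A**L) // (A - 1)
    if 4 <= L < P:
        n += 2 * (((A - 1) * (P - L) - 1) * A**P + A**L) // (A - 1)
    return n
```

For instance, `count_pairs(2, 3)` yields 12, 14, 8 and 8 ordered pairs at distances 1 to 4, which sums to $7\cdot 6=42$, the number of ordered pairs among the seven nodes, and agrees with `closed_form(2, 3, L)` for each $L$.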

Next comes the Bernstein inequality. Here we do not consider the full set of nodes $V$; instead, we focus on a strip of $V$ which is defined with the help of the sets $V(j,k,P)$ from the previous Lemma 2.1.

Theorem 2.2 (Bernstein inequality).

Let $T=(V,E)$ be a tree growing at an exponential rate $A$ . Let $Z_{v}$ be a real-valued random variable for each $v\in V$ with $\mathbbm{E}\left[\,Z_{v}\,\right]=0$ , $\left\lVert Z_{v}\right\rVert_{\infty}\leq C$ and $\text{Var}(Z_{v})\leq\sigma^{2}$ , for some $0<\sigma,C<\infty$ . Let $L\in\mathbbm{N}$ , $P\in\mathbbm{N}_{+}$ and consider the subtree induced by the set of nodes

$$V^{\prime}\coloneqq V(L,1,P)\cup\dots\cup V(L,A^{L},P)$$

with $V(L,i,P)$ as in the definition given in (2.1). Then

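$$\mathbbm{P}\Big(\sum_{v\in V^{\prime}}Z_{v}>\varepsilon\Big)\leq 2e^{-\beta\varepsilon}\exp\Big\{10e\,\alpha_{T}(f)^{(P_{2}+Q_{2})/(2P_{2}+2Q_{2}+A^{L})}\,\frac{A^{L}}{P_{2}+Q_{2}}\Big\}\cdot\exp\Big\{4\beta^{2}e\,(P_{2})^{2}\Big(\frac{A^{P}-1}{A-1}\sigma^{2}+4C^{2}\sum_{k=1}^{2(P-1)}\alpha_{T}(k)N(P,k)\Big)\Big(\frac{A^{L}}{P_{2}+Q_{2}}+1\Big)\Big\}\tag{2.4}$$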

where $Q_{2},P_{2}\in\mathbbm{N}_{+}$ such that $Q_{2}\leq P_{2}$ and $P_{2}+Q_{2}<A^{L}$ as well as

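$$\beta\leq\frac{A-1}{4eCP_{2}(A^{P}-1)}\quad\text{and}\quad f\coloneqq 2\left\lceil\frac{\log Q_{2}}{\log A}\right\rceil.$$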

Proof of Theorem 2.2.

We have to partition $V^{\prime}$ suitably. We use the abbreviations $\widetilde{V}(\,\cdot\,)\coloneqq V(L,\,\cdot\,,P)$ and $T\coloneqq\left\lceil A^{L}/(P_{2}+Q_{2})\right\rceil$ as well as,

[TABLE]

for $i=1,\ldots,T$. Note that each $A(i)$ and each $B(i)$ is a union of disjoint sets $\widetilde{V}(\,\cdot\,)$ and that some $A(i)$ and $B(i)$ might be empty. Furthermore, we define

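$$V^{\prime}_{1}\coloneqq\bigcup_{i=1}^{T}A(i)\quad\text{and}\quad V^{\prime}_{2}\coloneqq\bigcup_{i=1}^{T}B(i).$$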

Then, we have with Markov’s inequality and the well-known AM-GM inequality that

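$$\mathbbm{P}\Big(\sum_{v\in V^{\prime}}Z_{v}>\varepsilon\Big)\leq\frac{e^{-\beta\varepsilon}}{2}\Big\{\mathbbm{E}\Big[e^{2\beta\sum_{v\in V^{\prime}_{1}}Z_{v}}\Big]+\mathbbm{E}\Big[e^{2\beta\sum_{v\in V^{\prime}_{2}}Z_{v}}\Big]\Big\}\quad\text{for }\beta>0.$$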
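The two ingredients of this step can be written out explicitly. The following is a sketch which assumes that $V^{\prime}_{1}$ and $V^{\prime}_{2}$ are disjoint with $V^{\prime}=V^{\prime}_{1}\cup V^{\prime}_{2}$ (the partition announced at the start of the proof); it applies Markov's inequality to $\exp(\beta\sum_{v\in V^{\prime}}Z_{v})$ and then the AM-GM inequality in the form $xy\leq(x^{2}+y^{2})/2$.

```latex
\begin{align*}
\mathbbm{P}\Big(\sum_{v\in V'}Z_{v}>\varepsilon\Big)
&\leq e^{-\beta\varepsilon}\,
   \mathbbm{E}\Big[e^{\beta\sum_{v\in V'_{1}}Z_{v}}\,e^{\beta\sum_{v\in V'_{2}}Z_{v}}\Big]
   && \text{(Markov, } \beta>0\text{)}\\
&\leq \frac{e^{-\beta\varepsilon}}{2}
   \Big\{\mathbbm{E}\Big[e^{2\beta\sum_{v\in V'_{1}}Z_{v}}\Big]
        +\mathbbm{E}\Big[e^{2\beta\sum_{v\in V'_{2}}Z_{v}}\Big]\Big\}
   && \text{(AM-GM: } xy\leq (x^{2}+y^{2})/2\text{)}
\end{align*}
```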

Hence, it suffices to take a closer look at the sum $\sum_{v\in V^{\prime}_{1}}Z_{v}$. We write

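$$S(i)\coloneqq\sum_{v\in\bigcup_{k=1}^{i}A(k)}Z_{v}\quad\text{and}\quad J(i)\coloneqq\sum_{v\in A(i)}Z_{v}\quad\text{for }i=1,\dots,T.$$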

We compute the expectations of the random variables $e^{\delta S(i)}$ , for $\delta>0$ sufficiently small. Note that the distance w.r.t. $d_{G}$ between $v\in A(i)$ and $v^{\prime}\in A(i^{\prime})$ , $i\neq i^{\prime}$ , is at least $2\left\lceil\log Q_{2}/\log A\right\rceil$ . Since $S(i)=S(i-1)+J(i)$ , we infer from Davydov’s inequality given in Proposition A.1 that

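$$\mathbbm{E}\big[e^{\delta S(i)}\big]\leq 10\,\alpha_{T}(f)^{1/a}\,\lVert\exp(\delta S(i-1))\rVert_{b}\,\lVert\exp(\delta J(i))\rVert_{\infty}+\mathbbm{E}\big[e^{\delta S(i-1)}\big]\,\mathbbm{E}\big[e^{\delta J(i)}\big]\tag{2.5}$$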

for Hölder conjugate $a,b\geq 1$ and $f\coloneqq 2\left\lceil\log Q_{2}/\log A\right\rceil$ . Furthermore, we have if $|\delta J(i)|\leq\ 1/(2e)$ that

$$\exp(\delta J(i))\leq 1+\delta J(i)+\delta^{2}J(i)^{2}.$$

Now the random variables $Z_{v}$ are essentially bounded by $C$ . Let $\beta\leq(A-1)/(4eCP_{2}(A^{P}-1))$ and define $\delta\coloneqq 2\beta$ . Then, we have

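$$|\delta J(i)|\leq\delta\,C\,P_{2}\,\frac{A^{P}-1}{A-1}\leq\frac{1}{2e},\quad\text{hence,}\quad\mathbbm{E}[\exp(\delta J(i))]\leq\exp\big(\delta^{2}\,\mathbbm{E}[J(i)^{2}]\big).$$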
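The passage from the pointwise bound on $\exp(\delta J(i))$ to the bound on its expectation can be sketched as follows. The derivation uses only $|\delta J(i)|\leq 1/(2e)\leq 1$, the centering $\mathbbm{E}[Z_{v}]=0$ (hence $\mathbbm{E}[J(i)]=0$) and the elementary inequality $1+y\leq e^{y}$; it is a worked version of the step under these assumptions, not a quotation from the paper.

```latex
\begin{align*}
e^{x} &= 1+x+x^{2}\sum_{m\geq 2}\frac{x^{m-2}}{m!}
       \leq 1+x+x^{2}\,(e-2)
       \leq 1+x+x^{2}
       \qquad\text{for } |x|\leq 1,\\
\mathbbm{E}\big[e^{\delta J(i)}\big]
      &\leq 1+\delta\,\mathbbm{E}[J(i)]+\delta^{2}\,\mathbbm{E}[J(i)^{2}]
       = 1+\delta^{2}\,\mathbbm{E}[J(i)^{2}]
       \leq \exp\big(\delta^{2}\,\mathbbm{E}[J(i)^{2}]\big).
\end{align*}
```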

Note that in the subgraph induced by the $A(i)$ there are exactly $N(P,k)$ pairs of nodes $(v,w)$ with $d_{G}(v,w)=k\in\{1,\ldots,2(P-1)\}$ , where $N(P,k)$ is given in Lemma 2.1. For the next two lines we use the inequality $\left(\sum_{i=1}^{n}a_{i}\right)^{2}\leq n^{2}\sum_{i=1}^{n}a_{i}^{2}$ for real numbers $a_{i}$ , $i=1,\ldots,n$ , $n\in\mathbbm{N}$ . Consequently, we get

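$$\mathbbm{E}[J(i)^{2}]\leq(P_{2})^{2}\Big\{\frac{A^{P}-1}{A-1}\sigma^{2}+4C^{2}\sum_{k=1}^{2(P-1)}\alpha_{T}(k)N(P,k)\Big\}=:K$$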

with Davydov’s inequality from Proposition A.1.

Furthermore, we find with the Hölder inequality that $\left\lVert\exp(\delta S(i-1))\right\rVert_{1}\leq\left\lVert\exp(\delta S(i-1))\right\rVert_{b}$ . Thus, equation (2.5) can be bounded by

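$$\mathbbm{E}[\exp(\delta S(i))]\leq\Big\{10\,\alpha_{T}(f)^{1/a}\,\lVert\exp(\delta J(i))\rVert_{\infty}+\exp(\delta^{2}K)\Big\}\,\lVert\exp(\delta S(i-1))\rVert_{b}.\tag{2.6}$$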

Especially, for the case $i=T$ successive iteration of (2.6) yields for the choice $a\coloneqq T+1$ and $b=1+1/T$ (as in Valenzuela-Domínguez et al. (2017))

$$\mathbbm{E}[\exp(\delta S(T))]\leq\exp\Big\{10e\,\alpha_{T}(f)^{1/(T+1)}\,(T-1)+\delta^{2}eKT\Big\}.$$

Next, since $\alpha_{T}(f)\leq 1$ and $1/(1+T)\geq\frac{P_{2}+Q_{2}}{2(P_{2}+Q_{2})+A^{L}}$ , we arrive at

[TABLE]

The computations for $\mathbbm{E}\left[\,\exp\left(2\beta\sum_{v\in V^{\prime}_{2}}Z_{v}\right)\,\right]$ are similar and one achieves the same bounds for this term. This finishes the proof. ∎
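To get a feeling for the size of the terms in the Bernstein bound, one can evaluate the logarithm of its right-hand side numerically. The sketch below follows the bound of Theorem 2.2 as reconstructed from the extracted equations, with the factor $(P_{2})^{2}$ multiplying both the variance and the covariance term (as in the constant $K$ of the proof). The function names, the callable `alpha` for the mixing coefficients and the dictionary `pair_counts` for $N(P,k)$ are assumptions made for this illustration.

```python
import math


def ceil_log(q, base):
    """Smallest integer m >= 0 with base**m >= q, i.e. ceil(log q / log base) for q >= 1."""
    m = 0
    while base**m < q:
        m += 1
    return m


def bernstein_log_bound(eps, beta, A, L, P, P2, Q2, C, sigma2, alpha, pair_counts):
    """Natural log of the right-hand side of the Bernstein bound (reconstructed form).

    alpha: callable k -> alpha_T(k), supplied by the user.
    pair_counts: dict k -> N(P, k) for k = 1, ..., 2(P - 1).
    """
    if not (Q2 <= P2 and P2 + Q2 < A**L):
        raise ValueError("need Q2 <= P2 and P2 + Q2 < A**L")
    beta_max = (A - 1) / (4 * math.e * C * P2 * (A**P - 1))
    if not 0 < beta <= beta_max:
        raise ValueError("beta is outside the admissible range")
    f = 2 * ceil_log(Q2, A)
    blocks = A**L / (P2 + Q2)
    exponent = (P2 + Q2) / (2 * P2 + 2 * Q2 + A**L)
    # First exponential factor: contribution of the dependence between blocks.
    mixing_term = 10 * math.e * alpha(f) ** exponent * blocks
    # Second exponential factor: variance plus covariances inside a block.
    dependence = sum(alpha(k) * pair_counts[k] for k in range(1, 2 * (P - 1) + 1))
    variance_term = (
        4 * beta**2 * math.e * P2**2
        * ((A**P - 1) / (A - 1) * sigma2 + 4 * C**2 * dependence)
        * (blocks + 1)
    )
    return math.log(2) - beta * eps + mixing_term + variance_term
```

Working on the log scale avoids overflow for large $A^{L}$. For a binary tree with $P=2$ one may pass `pair_counts={1: 4, 2: 2}`, the ordered pair counts of a three-node subtree.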

We are now in position to derive a concentration inequality. We consider an infinite tree which grows at an exponential rate $A$ and which is endowed with a random field $Z$ . We assume that the random field $Z$ on the tree $T$ is strong mixing such that

[TABLE]

where $N(P,k)$ is defined in Lemma 2.1. We say that the mixing coefficients decay at a super-exponential (or hyper-exponential) rate if there is a positive increasing function $g$ with $\lim_{n\rightarrow\infty}g(n)=\infty$ such that

[TABLE]

In this case, equation (2.7) follows from Lemma 2.1 with the bound $N(P,k)\in\mathcal{O}\left(PA^{P+k/2}\right)$ and the following concentration inequality is true

Theorem 2.3 (Concentration inequality for exponentially growing trees).

Let $T=(V,E)$ be a tree growing at an exponential rate $A$ and let $Z$ be a random field on $T$ as in Theorem 2.2. Let the random field be strong mixing w.r.t. the graph metric with $\alpha$ -mixing coefficients which fulfill (2.7), e.g., the mixing coefficients decay at a super-exponential rate as in (2.8). Consider the subgraph which consists of the first $L$ generations of $T$ for $L\in\mathbbm{N}$

[TABLE]

Then there are constants $c_{1},c_{2}\in\mathbbm{R}_{+}$ such that for all $L\in\mathbbm{N}$ and $\varepsilon>0$

[TABLE]

This means the probability decays asymptotically at a rate which is approximately linear in the size of the sample $V_{L}$ .

Proof of Theorem 2.3.

Let $P_{1}:=\left\lfloor L^{\eta}\right\rfloor$ for some $\eta\in(0,1)$. We partition $V_{L}$ in the following way: first we define the wedge which consists of the first $L-P_{1}$ generations

[TABLE]

The remaining $P_{1}$ generations are collected in

[TABLE]

The sums which correspond to this partition are $\widetilde{S}_{L}\coloneqq\sum_{v\in W_{L}}Z_{v}$ and $S_{L}\coloneqq\sum_{v\in U_{L}}Z_{v}$. Then we split the probability as follows,

[TABLE]

The first probability in (2.9) is negligible because we find

[TABLE]

Thus, we can focus on the second probability in (2.9). We use Theorem 2.2. We make the following definitions

[TABLE]

Consider the exponent of the first factor given in (2.4): one finds that there is a constant $c\in\mathbbm{R}_{+}$ which depends neither on $L$ nor on $\varepsilon$ nor on the $Z_{v}$ such that

[TABLE]

The second factor in (2.4) is given by

[TABLE]

We can derive the following bound for the mixing coefficient and the exponent inside the $\exp$ -function in (2.11)

[TABLE]

Consider the second factor inside the $\exp$-function in (2.11); it satisfies $A^{L-P_{1}}/P_{2}\leq(L-P_{1})/\log L$. In particular, the second factor in (2.11) is uniformly bounded for all $L\in\mathbbm{N}_{+}$ if $D$ is sufficiently large. Consider the third factor in (2.4). Since the mixing coefficients decay sufficiently fast, we can derive the following inequality

[TABLE]

for a suitable constant $c\in\mathbbm{R}$. In particular, this expression is uniformly bounded over all $L\in\mathbbm{N}$. All in all, we have shown that there are constants $c_{1},c_{2}\in\mathbbm{R}_{+}$ such that the second probability in (2.9) is bounded as

[TABLE]

where the asymptotic speed is determined by (2.10). This completes the proof. ∎

The previous theorem can be applied to exponentially growing graphs as well. We have the useful corollary:

Corollary 2.4 (Concentration inequality for exponentially growing graphs).

Let $G=(V,E^{\prime}\cup\tilde{E})$ be a graph growing at an exponential rate $A$ endowed with a random field $Z$ as in Theorem 2.3. Then there are constants $c_{1},c_{2}\in\mathbbm{R}_{+}$ such that for all $L\in\mathbbm{N}$ and $\varepsilon>0$

[TABLE]

Proof of Corollary 2.4.

We only need to show that the mixing conditions for the tree $T=(V,E^{\prime})$ are fulfilled. The condition that $S\coloneqq\sup\{d_{T}(v,w)\,|\,(v,w)\in\tilde{E}\}<\infty$ implies that

[TABLE]

In particular, the mixing rates w.r.t. the tree and the whole graph structure satisfy asymptotically the inequality relations $\alpha_{T}(\left\lceil S\cdot\right\rceil)\leq\alpha_{G}\leq\alpha_{T}$ . Thus, we can conclude the statement from Theorem 2.3. ∎

Appendix A Appendix

Proposition A.1 (Davydov (1968)).

Let $(\Omega,\mathcal{A},\mathbbm{P})$ be a probability space and let $\mathcal{G},\mathcal{H}\subseteq\mathcal{A}$ be sub-$\sigma$-algebras. Denote by $\alpha\coloneqq\sup\{|\mathbbm{P}(A\cap B)-\mathbbm{P}(A)\mathbbm{P}(B)|:\,A\in\mathcal{G},B\in\mathcal{H}\}$ the $\alpha$-mixing coefficient between $\mathcal{G}$ and $\mathcal{H}$. Let $p,q,r\geq 1$ be Hölder conjugate, i.e., $p^{-1}+q^{-1}+r^{-1}=1$. Let $\xi$ (resp. $\eta$) be in $L^{p}(\mathbbm{P})$ and $\mathcal{G}$-measurable (resp. in $L^{q}(\mathbbm{P})$ and $\mathcal{H}$-measurable). Then

[TABLE]

Bibliography

Bernstein (1927). S. Bernstein. Sur l’extension du théorème limite du calcul des probabilités aux sommes de quantités dépendantes. Mathematische Annalen, 97(1):1–59, 1927.
Bradley (2005). R. C. Bradley. Basic properties of strong mixing conditions. A survey and some open questions. Probability Surveys, 2(2):107–144, 2005.
Bryc and Dembo (1996). W. Bryc and A. Dembo. Large deviations and strong mixing. In Annales de l’IHP Probabilités et statistiques, volume 32, pages 549–569, 1996.
Carbon (1983). M. Carbon. Inégalité de Bernstein pour les processus fortement mélangeants non nécessairement stationnaires. C.R. Acad. Sc. Paris I, 297:303–306, 1983.
Cohen et al. (1994). R. F. Cohen, P. Eades, T. Lin, and F. Ruskey. Three-dimensional graph drawing. In International Symposium on Graph Drawing, pages 1–11. Springer, 1994.
Collomb (1984). G. Collomb. Propriétés de convergence presque complète du prédicteur à noyau. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 66(3):441–460, 1984.
Davydov (1968). Y. A. Davydov. Convergence of distributions generated by stationary stochastic processes. Theory of Probability & Its Applications, 13(4):691–696, 1968.
Doukhan (1994). P. Doukhan. Mixing, volume 85 of Lecture Notes in Statistics. Springer-Verlag, New York, 1994.
