A Bernstein type inequality for sums of selections from three dimensional arrays

Debapratim Banerjee; Matteo Sordello

arXiv:1907.08729·math.PR·March 13, 2020

TL;DR

This paper establishes Bernstein-type concentration inequalities for sums of selected entries from three-dimensional arrays, extending classical results to more complex, higher-dimensional random permutation-based statistics.

Contribution

The paper introduces Bernstein inequalities for three-dimensional array sums involving permutations, generalizing previous two-dimensional results.

Findings

1. Derives Bernstein inequalities for the sums T1 and T2.

2. Extends concentration results from the 2D to the 3D array setting.

3. Provides tools for analyzing permutation-based sums in higher dimensions.

Abstract

We consider the three dimensional array $\mathcal{A}=\{a_{i,j,k}\}_{1\leq i,j,k\leq n}$ , with $a_{i,j,k}\in[0,1]$ , and the two random statistics $T_{1}:=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma(i)}$ and $T_{2}:=\sum_{i=1}^{n}a_{i,\sigma(i),\pi(i)}$ , where $\sigma$ and $\pi$ are chosen independently from the set of permutations of $\{1,2,\ldots,n\}.$ These can be viewed as natural three dimensional generalizations of the statistic $T_{3}=\sum_{i=1}^{n}a_{i,\sigma(i)}$ , considered by Hoeffding [3]. Here we give Bernstein type concentration inequalities for $T_{1}$ and $T_{2}$ by extending the argument for concentration of $T_{3}$ by Chatterjee [1].

Equations

$$T_{1}:=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma(i)}\quad\text{and}\quad T_{2}:=\sum_{i=1}^{n}a_{i,\sigma(i),\pi(i)}$$

$$T_{3}:=\sum_{i=1}^{n}a_{i,\sigma(i)}$$

$$\mathbb{P}\left(|T_{3}-\operatorname{E}[T_{3}]|\geq t\right)\leq 2\exp\left\{-\frac{t^{2}}{4\operatorname{E}[T_{3}]+2t}\right\}$$

$$\mathbb{P}\left(|T_{1}-\operatorname{E}[T_{1}]|\geq t\right)$$

$$\mathbb{P}\left(|T_{2}-\operatorname{E}[T_{2}]|\geq t\right)$$

$$v(X):=\frac{1}{2}\operatorname{E}\left[|(f(X)-f(X^{\prime}))\cdot F(X,X^{\prime})|\ \big|\ X\right].$$

$$\mathbb{P}\left(|f(X)|\geq t\right)\leq 2\exp\left\{-\frac{t^{2}}{2C+2Bt}\right\}$$

$$(\sigma^{\prime},\pi^{\prime})=\left\{\begin{array}{ll}\left(\sigma\circ\tau_{1}(I_{1},I_{2},I_{3}),\pi\circ\tau_{2}(I_{1},I_{2},I_{3})\right)&\text{with probability }\frac{1}{2}\\[2pt] \left(\sigma\circ\tau_{2}(I_{1},I_{2},I_{3}),\pi\circ\tau_{1}(I_{1},I_{2},I_{3})\right)&\text{with probability }\frac{1}{2}\end{array}\right.$$

$$T_{2}^{\prime}:=\sum_{i}a_{i,\sigma^{\prime}(i),\pi^{\prime}(i)}.$$

$$T_{1}^{\prime}:=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma^{\prime}(i)}=T_{1}+\sum_{j=1}^{n}\left(a_{I_{1},j,\sigma(I_{2})}+a_{I_{2},j,\sigma(I_{1})}-a_{I_{1},j,\sigma(I_{1})}-a_{I_{2},j,\sigma(I_{2})}\right)$$

$$\mathbb{P}\left(T_{1}=x,T_{1}^{\prime}=x^{\prime}\,\big|\,\sigma,I_{1},I_{2}\right)=\mathbf{1}\left(\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma(i)}=x,\ \sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma\circ(I_{1},I_{2})(i)}=x^{\prime}\right)$$

$$\begin{aligned}\mathbb{P}\left(T_{1}=x,T_{1}^{\prime}=x^{\prime}\right)&=\operatorname{E}\left[\sum_{\gamma_{2}\in\mathcal{S}_{n}}\frac{1}{n!}\cdot\mathbf{1}\left(\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\gamma_{2}\circ(I_{1},I_{2})(i)}=x,\ \sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\gamma_{2}(i)}=x^{\prime}\right)\right]\\&=\mathbb{P}\left(T_{1}^{\prime}=x,T_{1}=x^{\prime}\right)\end{aligned}$$

$$\begin{aligned}\operatorname{E}\left[T_{1}-T_{1}^{\prime}\,\big|\,T_{1}\right]&=\sum_{j=1}^{n}\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{k=1}^{n}\operatorname{E}\left[a_{i,j,\sigma(i)}+a_{k,j,\sigma(k)}-a_{i,j,\sigma(k)}-a_{k,j,\sigma(i)}\,\big|\,T_{1}\right]\\&=\frac{2}{n}T_{1}-\frac{2}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{a_{i,j,1}+\ldots+a_{i,j,n}}{n}=\frac{2}{n}\left(T_{1}-\operatorname{E}[T_{1}]\right).\end{aligned}$$

$$\begin{aligned}v(T_{1})&\leq\frac{n^{2}}{2}\operatorname{E}\left[\sum_{j=1}^{n}\left(a_{I_{1},j,\sigma(I_{1})}+a_{I_{2},j,\sigma(I_{2})}+a_{I_{1},j,\sigma(I_{2})}+a_{I_{2},j,\sigma(I_{1})}\right)\bigg|\ T_{1}\right]\\&=\frac{n^{2}}{2}\cdot\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{k=1}^{n}\operatorname{E}\left[\sum_{j=1}^{n}\left(a_{i,j,\sigma(i)}+a_{k,j,\sigma(k)}+a_{i,j,\sigma(k)}+a_{k,j,\sigma(i)}\right)\bigg|\ T_{1}\right]\\&=n\cdot T_{1}+n\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{a_{i,j,1}+\ldots+a_{i,j,n}}{n}=n\left(T_{1}+\operatorname{E}[T_{1}]\right)=n\left(f(T_{1})+2\operatorname{E}[T_{1}]\right).\end{aligned}$$

$$\mathbb{P}\left(|T_{1}-\operatorname{E}[T_{1}]|\geq t\right)\leq 2\exp\left\{-\frac{t^{2}}{2n\cdot(2\operatorname{E}[T_{1}]+t)}\right\}.$$

$$\mathbb{P}\left(|T_{1}-\operatorname{E}[T_{1}]|\geq n^{1+\lambda}\right)\leq 2\exp\left\{-\frac{n^{2+2\lambda}}{2n\cdot(2\operatorname{E}[T_{1}]+n^{1+\lambda})}\right\}\approx 2\exp\left\{-\frac{n^{\lambda}}{2}\right\}$$

$$T_{2}^{\prime}=T_{2}-\sum_{j=1}^{3}a_{I_{j},\sigma(I_{j}),\pi(I_{j})}+\left\{\begin{array}{cc}\sum_{j=1}^{3}a_{I_{j},\sigma\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j})}&\text{with prob. }\frac{1}{2}\\[4pt] \sum_{j=1}^{3}a_{I_{j},\sigma\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j})}&\text{with prob. }\frac{1}{2}.\end{array}\right.$$

$$\mathbb{P}\left(T_{2}=x,T_{2}^{\prime}=x^{\prime}\,\big|\,\sigma,\pi,\sigma^{\prime},\pi^{\prime}\right)=\mathbf{1}\left(\sum_{i}a_{i,\sigma(i),\pi(i)}=x,\ \sum_{i}a_{i,\sigma^{\prime}(i),\pi^{\prime}(i)}=x^{\prime}\right).$$

$$\begin{aligned}\mathbb{P}\left(T_{2}=x,T_{2}^{\prime}=x^{\prime}\right)&=\operatorname{E}\Bigg[\sum_{(\gamma_{1},\gamma_{2})\in\mathcal{S}_{n}^{2}}\frac{1}{(n!)^{2}}\Bigg\{\frac{1}{2}\cdot\mathbf{1}\left(\sum_{i}a_{i,\gamma_{1}(i),\gamma_{2}(i)}=x,\ \sum_{i}a_{i,\gamma_{1}\circ\tau_{1}(I_{1},I_{2},I_{3})(i),\gamma_{2}\circ\tau_{2}(I_{1},I_{2},I_{3})(i)}=x^{\prime}\right)\\&\qquad+\frac{1}{2}\cdot\mathbf{1}\left(\sum_{i}a_{i,\gamma_{1}(i),\gamma_{2}(i)}=x,\ \sum_{i}a_{i,\gamma_{1}\circ\tau_{2}(I_{1},I_{2},I_{3})(i),\gamma_{2}\circ\tau_{1}(I_{1},I_{2},I_{3})(i)}=x^{\prime}\right)\Bigg\}\Bigg]\\&=\operatorname{E}\Bigg[\sum_{(\gamma_{3},\gamma_{4})\in\mathcal{S}_{n}^{2}}\frac{1}{2(n!)^{2}}\cdot\mathbf{1}\left(\sum_{i}a_{i,\gamma_{3}\circ\tau_{2}(I_{1},I_{2},I_{3})(i),\gamma_{4}\circ\tau_{1}(I_{1},I_{2},I_{3})(i)}=x,\ \sum_{i}a_{i,\gamma_{3}(i),\gamma_{4}(i)}=x^{\prime}\right)\\&\qquad+\sum_{(\gamma_{5},\gamma_{6})\in\mathcal{S}_{n}^{2}}\frac{1}{2(n!)^{2}}\cdot\mathbf{1}\left(\sum_{i}a_{i,\gamma_{5}\circ\tau_{1}(I_{1},I_{2},I_{3})(i),\gamma_{6}\circ\tau_{2}(I_{1},I_{2},I_{3})(i)}=x,\ \sum_{i}a_{i,\gamma_{5}(i),\gamma_{6}(i)}=x^{\prime}\right)\Bigg]\\&=\mathbb{P}\left(T_{2}=x^{\prime},T_{2}^{\prime}=x\right).\end{aligned}$$

$$\operatorname{E}\left[T_{2}^{\prime}\,\big|\,\sigma,\pi\right]$$

$$+\left.\frac{1}{2}\left(\sum_{j=1}^{3}a_{I_{j},\sigma\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j})}\right)\big|\,\sigma,\pi\right].$$

$$\operatorname{E}\left[\operatorname{E}\left[\sum_{j=1}^{3}a_{I_{j},\sigma(I_{j}),\pi(I_{j})}\,\Big|\,\sigma,\pi\right]\Big|\,T_{2}\right]=\operatorname{E}\left[\sum_{j=1}^{3}\frac{1}{n}\sum_{i=1}^{n}a_{i,\sigma(i),\pi(i)}\,\Big|\,T_{2}\right]=\frac{3}{n}T_{2}.$$

$$\begin{aligned}\operatorname{E}\left[a_{I_{j},\sigma\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j})}\,\big|\,\sigma,\pi\right]&=\operatorname{E}\left[a_{I_{j},\sigma\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j})}\,\big|\,\sigma,\pi\right]\\&=\frac{1}{n(n-1)(n-2)}\sum_{(i,j,k)\in C_{3}}a_{i,\sigma(j),\pi(k)}\end{aligned}$$

$$\operatorname{E}\left[\frac{1}{2}\left(\sum_{j=1}^{3}a_{I_{j},\sigma\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j})}\right)+\frac{1}{2}\left(\sum_{j=1}^{3}a_{I_{j},\sigma\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j})}\right)\Big|\,\sigma,\pi\right]$$

Full text

A Bernstein type inequality for sums of selections from three dimensional arrays

Debapratim Banerjee and Matteo Sordello

Dept. of Statistics, University of Pennsylvania

Abstract

We consider the three dimensional array $\mathcal{A}=\{a_{i,j,k}\}_{1\leq i,j,k\leq n}$ , with $a_{i,j,k}\in[0,1]$ , and the two random statistics $T_{1}:=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma(i)}$ and $T_{2}:=\sum_{i=1}^{n}a_{i,\sigma(i),\pi(i)}$ , where $\sigma$ and $\pi$ are chosen independently from the set of permutations of $\{1,2,\ldots,n\}.$ These can be viewed as natural three dimensional generalizations of the statistic $T_{3}=\sum_{i=1}^{n}a_{i,\sigma(i)}$ , considered by Hoeffding [3]. Here we give Bernstein type concentration inequalities for $T_{1}$ and $T_{2}$ by extending the argument for concentration of $T_{3}$ by Chatterjee [1].

1 Arrays and Concentration Inequalities

Let $\mathcal{A}=\{a_{i,j,k}\}_{1\leq i,j,k\leq n}$ be a three dimensional array with $a_{i,j,k}\in[0,1]$ , and consider the following two statistics

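$$T_{1}:=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma(i)}\quad\text{and}\quad T_{2}:=\sum_{i=1}^{n}a_{i,\sigma(i),\pi(i)},\tag{1.1}$$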

where $\sigma$ and $\pi$ are chosen independently and uniformly from the set $S_{n}$ of permutations of $[n]=\{1,2,\ldots,n\}$ . Our goal is to obtain Bernstein type tail bounds for the statistics $T_{1}$ and $T_{2}$ . Statistics of this type have already been considered in the literature; for example, when the dimension is two, the statistic

$$T_{3}:=\sum_{i=1}^{n}a_{i,\sigma(i)},$$

where $\sigma$ is drawn uniformly from $S_{n}$ , was studied by Hoeffding [3], who proved that, under certain conditions, it is asymptotically normal as $n$ goes to infinity. In fact, the special case $a_{i,j}=c_{i}\cdot d_{j}$ dates back to the works of Wald and Wolfowitz [8] and Noether [6]. Another example of the statistic $T_{3}$ is Spearman’s footrule, useful in non-parametric statistics, where $a_{i,j}=|i-j|.$ The statistics $T_{1}$ and $T_{2}$ can be viewed as natural generalizations of $T_{3}$ to three dimensions. In this paper, however, we are concerned with concentration inequalities for $T_{1}$ and $T_{2}$ rather than with their asymptotic distributions. The concentration of $T_{3}$ was considered by Chatterjee [1] (page 52), who obtained an elegant tail bound of Bernstein type.

Theorem 1.1.

Let $a_{i,j}\in[0,1]$ for $1\leq i,j\leq n$ and let $T_{3}$ be as above. Then for any $t\geq 0,$

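$$\mathbb{P}\left(|T_{3}-\operatorname{E}[T_{3}]|\geq t\right)\leq 2\exp\left\{-\frac{t^{2}}{4\operatorname{E}[T_{3}]+2t}\right\}.$$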

Chatterjee obtains this bound by the method of exchangeable pairs, and here we extend his method to obtain Bernstein type concentration inequalities for $T_{1}$ and $T_{2}$ .
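As an illustration only, the following sketch compares the right-hand side of Theorem 1.1 with a Monte Carlo estimate of the tail of $T_{3}$. The array (Spearman's footrule rescaled by $1/n$ so that entries lie in $[0,1]$), the values of `n`, `t`, the number of draws and the seed are all assumptions made for the example, not choices taken from the paper.

```python
import numpy as np

def t3(a, sigma):
    """T_3 = sum_i a[i, sigma(i)] for a 2-D array a with entries in [0, 1]."""
    return a[np.arange(len(sigma)), sigma].sum()

def chatterjee_bound(t, mean_t3):
    """Right-hand side of Theorem 1.1: 2 exp(-t^2 / (4 E[T_3] + 2 t))."""
    return 2.0 * np.exp(-t**2 / (4.0 * mean_t3 + 2.0 * t))

def empirical_tail(a, t, draws, rng):
    """Monte Carlo estimate of P(|T_3 - E[T_3]| >= t) for a uniform sigma."""
    n = a.shape[0]
    mean_t3 = a.sum() / n  # E[T_3] = (1/n) * sum_{i,j} a[i, j]
    hits = sum(abs(t3(a, rng.permutation(n)) - mean_t3) >= t for _ in range(draws))
    return hits / draws

# Hypothetical example: footrule entries |i - j| rescaled by 1/n.
n = 30
idx = np.arange(n)
a = np.abs(idx[:, None] - idx[None, :]) / n
rng = np.random.default_rng(0)
t = 8.0
tail = empirical_tail(a, t, draws=2000, rng=rng)
bound = chatterjee_bound(t, a.sum() / n)
print(f"empirical tail = {tail}, Bernstein-type bound = {bound}")
```

The estimate is random and only indicative; the theorem guarantees the inequality for the true probability, not for any particular finite sample.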

Theorem 1.2.

If $T_{1}$ and $T_{2}$ are as defined in (1.1) and $a_{i,j,k}\in[0,1]$ for $1\leq i,j,k\leq n$ , then

[TABLE]

Concentration of functions of random permutations has also been studied by Talagrand [7] (Theorem 5.1), Maurey [4] and McDiarmid [5]. However, as mentioned in Chatterjee [1], apart from Talagrand’s Theorem 5.1 none of these results gives Bernstein type concentration inequalities as above.
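To make the definitions of $T_{1}$ and $T_{2}$ concrete, here is a minimal NumPy sketch that evaluates both statistics and their exact expectations. The function names, the random array and the seed are hypothetical; the expectations use only the fact that $\sigma(i)$ and $\pi(i)$ are uniform on $[n]$ and independent of each other.

```python
import numpy as np

def t1(a, sigma):
    """T_1 = sum_i sum_j a[i, j, sigma(i)]."""
    n = a.shape[0]
    return a[np.arange(n), :, sigma].sum()

def t2(a, sigma, pi):
    """T_2 = sum_i a[i, sigma(i), pi(i)]."""
    n = a.shape[0]
    return a[np.arange(n), sigma, pi].sum()

def expected_t1(a):
    # sigma(i) is uniform on [n], so E[T_1] = (1/n) * sum_{i,j,k} a[i, j, k].
    return a.sum() / a.shape[0]

def expected_t2(a):
    # sigma(i) and pi(i) are independent and uniform,
    # so E[T_2] = (1/n^2) * sum_{i,j,k} a[i, j, k].
    return a.sum() / a.shape[0] ** 2

rng = np.random.default_rng(0)      # arbitrary seed
n = 5
a = rng.random((n, n, n))           # hypothetical array with entries in [0, 1)
sigma, pi = rng.permutation(n), rng.permutation(n)
print("T_1 =", t1(a, sigma), " E[T_1] =", expected_t1(a))
print("T_2 =", t2(a, sigma, pi), " E[T_2] =", expected_t2(a))
```

Note that $T_{1}$ is a sum of $n^{2}$ terms and can be of order $n^{2}$, while $T_{2}$ is a sum of $n$ terms.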

2 On the method of exchangeable pairs

We first need to recall some notions on the theory of exchangeable pairs as used by Chatterjee [1].

Definition 2.1.

Suppose $X$ is a random variable on the probability space $(\Omega,\mathcal{F},\mathbb{P})$ and $X^{\prime}$ is another random variable defined on the same space. The pair $(X,X^{\prime})$ is called an exchangeable pair if $(X,X^{\prime})\stackrel{{\scriptstyle d}}{{=}}(X^{\prime},X)$ .

The method of exchangeable pairs exploits three useful functions:

•

A function $F:\mathbb{R}^{2}\to\mathbb{R}$ , measurable and almost surely anti-symmetric, i.e. such that $F(X,X^{\prime})=-F(X^{\prime},X)$ almost surely.

•

The function $f:\mathbb{R}\to\mathbb{R}$ defined by $f(X):=\operatorname{E}\left[F(X,X^{\prime})\left|X\right.\right]$ . This is the quantity whose concentration the method controls.

•

The function $v(X)$ , which serves as a stochastic bound on the size of $f(X)$ and is defined by

$$v(X):=\frac{1}{2}\operatorname{E}\left[|(f(X)-f(X^{\prime}))\cdot F(X,X^{\prime})|\ \big|\ X\right].$$

The following lemma from Chatterjee [1] tells us how the concentration of $f(X)$ is governed by a bound on $v(X)$ .

Lemma 2.1.

(Theorem 3.9 in [1]) Suppose $(X,X^{\prime})$ is an exchangeable pair and $F(X,X^{\prime})$ , $f(X)$ and $v(X)$ are defined as before, with $v(X)\leq C+Bf(X)$ almost surely for some known fixed constants $B$ and $C$ . Then

$$\mathbb{P}\left(|f(X)|\geq t\right)\leq 2\exp\left\{-\frac{t^{2}}{2C+2Bt}\right\}.$$

The fundamental idea of the method of exchangeable pairs is to construct $F(X,X^{\prime})$ , $f(X)$ and $v(X)$ so that Lemma 2.1 yields concentration for $f(X)$ . One example to keep in mind of $F(X,X^{\prime})$ is $c(X-X^{\prime})$ , where $c$ is a nonrandom constant.
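As a worked instance of the example $F(X,X^{\prime})=c(X-X^{\prime})$, suppose (this is an assumption made for illustration, with $\kappa>0$ a hypothetical constant) that the pair satisfies a linear regression condition. Then $f$ and $v$ take a simple form:

```latex
\text{Assume } \operatorname{E}\left[X-X^{\prime}\mid X\right]=\kappa\left(X-\operatorname{E}[X]\right)
\text{ and take } F(X,X^{\prime})=\frac{1}{\kappa}\,(X-X^{\prime}). \text{ Then}
\begin{align*}
f(X) &= \frac{1}{\kappa}\operatorname{E}\left[X-X^{\prime}\mid X\right] = X-\operatorname{E}[X],\\
v(X) &= \frac{1}{2}\operatorname{E}\left[\left|(X-X^{\prime})\cdot\frac{1}{\kappa}(X-X^{\prime})\right|\ \Big|\ X\right]
      = \frac{1}{2\kappa}\operatorname{E}\left[(X-X^{\prime})^{2}\mid X\right].
\end{align*}
```

In this form, a bound on the conditional second moment of $X-X^{\prime}$ that is linear in $f(X)$ is exactly what Lemma 2.1 requires.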

Remark 2.1.

Chatterjee ([2], Theorem 1.5) has further proved under the hypotheses of Lemma 2.1 that for the lower tail probabilities $\mathbb{P}(f(X)\leq-t)$ one has a genuinely Gaussian bound of the form $\exp\left(-\frac{t^{2}}{2C}\right)$ . One can also show by a further modification of Chatterjee’s method that there is a Gaussian bound for the lower tail probabilities of $T_{2}$ and $T_{3}$ , but we do not pursue these bounds here.

3 Strategy of the Proofs

To prove the concentration inequalities for $T_{1}$ and $T_{2}$ , we use the following general strategy. First we construct the statistics $T_{1}^{\prime}$ and $T_{2}^{\prime}$ by applying “small” changes to $T_{1}$ and $T_{2}$ , such that two properties hold: we require $(T_{j},T_{j}^{\prime})$ to form an exchangeable pair, and $\operatorname{E}\left[T_{j}-T_{j}^{\prime}\left|T_{j}\right.\right]$ to be close to $c\left(T_{j}-\operatorname{E}[T_{j}]\right)/n$ for each $j\in\{1,2\}$ . We then define the quantity $v(T_{j})$ as in the previous section and bound it in terms of $T_{j}-\operatorname{E}[T_{j}]$ . Finally, we derive the concentration inequality for $\left|T_{j}-\operatorname{E}[T_{j}]\right|$ by applying Lemma 2.1.

The construction of $T_{1}^{\prime}$ is done by choosing two indexes $I_{1},I_{2}$ independently and uniformly at random from $[n]$ , and considering the permutation $(I_{1},I_{2})$ that interchanges the two indexes. We then define the permutation $\sigma^{\prime}=\sigma\circ(I_{1},I_{2})$ and the statistic $T_{1}^{\prime}:=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma^{\prime}(i)}$ , and we prove that $\operatorname{E}\left[T_{1}-T_{1}^{\prime}\left|T_{1}\right.\right]=2\left(T_{1}-\operatorname{E}[T_{1}]\right)/n$ . This is a similar procedure to the one in Chatterjee [1].
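The identity $\operatorname{E}\left[T_{1}-T_{1}^{\prime}\left|T_{1}\right.\right]=2\left(T_{1}-\operatorname{E}[T_{1}]\right)/n$ can be checked numerically. The sketch below fixes $\sigma$ and averages $T_{1}-T_{1}^{\prime}$ over all $n^{2}$ equally likely pairs $(I_{1},I_{2})$; the helper names and the random test array are assumptions for the example.

```python
import numpy as np

def t1(a, sigma):
    """T_1 = sum_i sum_j a[i, j, sigma(i)]."""
    n = a.shape[0]
    return a[np.arange(n), :, sigma].sum()

def swap_compose(sigma, i1, i2):
    """Return sigma' = sigma o (i1, i2): sigma'(i1) = sigma(i2), sigma'(i2) = sigma(i1)."""
    s = sigma.copy()
    s[i1], s[i2] = sigma[i2], sigma[i1]
    return s

def mean_drift(a, sigma):
    """Average of T_1 - T_1' over I_1, I_2 drawn independently and uniformly from [n]."""
    n = a.shape[0]
    base = t1(a, sigma)
    total = 0.0
    for i1 in range(n):
        for i2 in range(n):
            total += base - t1(a, swap_compose(sigma, i1, i2))
    return total / n**2

rng = np.random.default_rng(0)   # arbitrary seed
n = 6
a = rng.random((n, n, n))        # hypothetical array with entries in [0, 1)
sigma = rng.permutation(n)
lhs = mean_drift(a, sigma)
rhs = 2.0 * (t1(a, sigma) - a.sum() / n) / n   # 2 (T_1 - E[T_1]) / n
print(lhs, rhs)
```

The two printed numbers should agree up to floating point error for any fixed $\sigma$, which is slightly stronger than the conditional statement given $T_{1}$.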

The construction of $T_{2}^{\prime}$ is not as simple. The main reason is that $\operatorname{E}[T_{2}]$ is a sum over three independent directions $i,j,k$ , while, for fixed $\sigma$ and $\pi$ , $T_{2}$ is a sum over a single direction. As a consequence, one can check that it is not possible to get $\operatorname{E}\left[T_{2}-T_{2}^{\prime}\left|T_{2}\right.\right]$ close to $c\left(T_{2}-\operatorname{E}[T_{2}]\right)/n$ by simply moving two indexes. Instead, one needs to move three indexes in a systematic way. We therefore draw $(I_{1},I_{2},I_{3})$ uniformly without replacement from $[n]$ and define the functions $\tau_{1,2}:[n]^{3}\to\mathcal{S}_{n}$ such that $\tau_{1}(I_{1},I_{2},I_{3})=(I_{1},I_{2},I_{3})$ and $\tau_{2}(I_{1},I_{2},I_{3})=(I_{1},I_{3},I_{2})$ . These are the only cyclic permutations of the three indexes other than the identity. The permutations $\sigma^{\prime},\pi^{\prime}$ are defined as follows:

$$(\sigma^{\prime},\pi^{\prime})=\left\{\begin{array}{ll}\left(\sigma\circ\tau_{1}(I_{1},I_{2},I_{3}),\pi\circ\tau_{2}(I_{1},I_{2},I_{3})\right)&\text{with probability }\frac{1}{2}\\[2pt] \left(\sigma\circ\tau_{2}(I_{1},I_{2},I_{3}),\pi\circ\tau_{1}(I_{1},I_{2},I_{3})\right)&\text{with probability }\frac{1}{2}\end{array}\right.$$

Note that this $\sigma^{\prime}$ is different from the one defined in the construction of $T_{1}^{\prime}$ , but it will always be clear which one of the two we are considering. Finally we define

$$T_{2}^{\prime}:=\sum_{i}a_{i,\sigma^{\prime}(i),\pi^{\prime}(i)}.$$

For $\sigma^{\prime}$ and $\pi^{\prime}$ to be valid permutations, the indexes $(I_{1},I_{2},I_{3})$ must either be all distinct or all equal. For convenience we only consider the case when $(I_{1},I_{2},I_{3})$ are all distinct, since the case when they are all equal does not affect the exchangeability of $T_{2}$ and $T_{2}^{\prime}$ and only gives a slight change in the result, which is negligible as $n$ grows to infinity. It is important to note that $\sigma^{\prime}$ and $\pi^{\prime}$ must be valid permutations in order for $T_{2}^{\prime}$ to have the same distribution as $T_{2}$ , which is a necessary condition for exchangeability.
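The construction of $(\sigma^{\prime},\pi^{\prime})$ can be sketched as follows, under the assumption that permutations are stored as zero-based NumPy index arrays and that the cycle $(I_{1},I_{2},I_{3})$ maps $I_{1}\to I_{2}\to I_{3}\to I_{1}$. The function names and the seed are hypothetical.

```python
import numpy as np

def three_cycle(n, i1, i2, i3):
    """Permutation of range(n) sending i1 -> i2 -> i3 -> i1 and fixing everything else."""
    tau = np.arange(n)
    tau[i1], tau[i2], tau[i3] = i2, i3, i1
    return tau

def coupled_pair(sigma, pi, rng):
    """Draw (sigma', pi') from (sigma, pi) using three distinct random indexes."""
    n = len(sigma)
    i1, i2, i3 = rng.choice(n, size=3, replace=False)   # distinct indexes
    tau1 = three_cycle(n, i1, i2, i3)                    # the cycle (I1, I2, I3)
    tau2 = three_cycle(n, i1, i3, i2)                    # the cycle (I1, I3, I2)
    # sigma[tau] is the composition sigma o tau, since sigma[tau][i] = sigma[tau[i]].
    if rng.random() < 0.5:
        return sigma[tau1], pi[tau2]
    return sigma[tau2], pi[tau1]

rng = np.random.default_rng(0)   # arbitrary seed
n = 7
sigma, pi = rng.permutation(n), rng.permutation(n)
sigma_p, pi_p = coupled_pair(sigma, pi, rng)
print(sigma, sigma_p)
print(pi, pi_p)
```

Because the three indexes are distinct, `sigma_p` and `pi_p` are again permutations, and each differs from the original in exactly three positions.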

4 Proofs of the results

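To prove (1.2), we exchange two pairs of one dimensional rows (i.e. with $n$ elements each) in the $j^{th}$ direction of the array $\mathcal{A}$ , and get

$$T_{1}^{\prime}:=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma^{\prime}(i)}=T_{1}+\sum_{j=1}^{n}\left(a_{I_{1},j,\sigma(I_{2})}+a_{I_{2},j,\sigma(I_{1})}-a_{I_{1},j,\sigma(I_{1})}-a_{I_{2},j,\sigma(I_{2})}\right),$$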

where $\sigma^{\prime}=\sigma\circ(I_{1},I_{2})$ and $I_{1},I_{2}$ are extracted independently and uniformly at random from $[n]$ .

Proposition 4.1.

$(T_{1},T_{1}^{\prime})$ is an exchangeable pair.

Proof.

We have the identity $\mathbb{P}\left(T_{1}=x,T_{1}^{\prime}=x^{\prime}\right)=\operatorname{E}\left[\mathbb{P}\left(T_{1}=x,T_{1}^{\prime}=x^{\prime}|\sigma,I_{1},I_{2}\right)\right]$ and, since $(T_{1},T_{1}^{\prime})$ is a deterministic function of $\sigma,I_{1}$ and $I_{2}$ , we can write

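$$\mathbb{P}\left(T_{1}=x,T_{1}^{\prime}=x^{\prime}\,\big|\,\sigma,I_{1},I_{2}\right)=\mathbf{1}\left(\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma(i)}=x,\ \sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\sigma\circ(I_{1},I_{2})(i)}=x^{\prime}\right),$$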

where as usual $\mathbf{1}(\cdot)$ is the indicator function of the event in brackets. One then has

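$$\begin{aligned}\mathbb{P}\left(T_{1}=x,T_{1}^{\prime}=x^{\prime}\right)&=\operatorname{E}\left[\sum_{\gamma_{1}\in\mathcal{S}_{n}}\frac{1}{n!}\cdot\mathbf{1}\left(\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\gamma_{1}(i)}=x,\ \sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\gamma_{1}\circ(I_{1},I_{2})(i)}=x^{\prime}\right)\right]\\&=\operatorname{E}\left[\sum_{\gamma_{2}\in\mathcal{S}_{n}}\frac{1}{n!}\cdot\mathbf{1}\left(\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\gamma_{2}\circ(I_{1},I_{2})(i)}=x,\ \sum_{i=1}^{n}\sum_{j=1}^{n}a_{i,j,\gamma_{2}(i)}=x^{\prime}\right)\right]\\&=\mathbb{P}\left(T_{1}^{\prime}=x,T_{1}=x^{\prime}\right),\end{aligned}$$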

where we just set $\gamma_{2}=\gamma_{1}\circ(I_{1},I_{2})$ . Moreover, for each fixed pair $(I_{1},I_{2})$ , summing over all possible $\gamma_{1}$ is equivalent to summing over all $\gamma_{2}$ , since both sums are made over the whole $\mathcal{S}_{n}$ . ∎

When we take the expectation of $T_{1}-T_{1}^{\prime}$ with respect to $\sigma$ , $I_{1}$ and $I_{2}$ , conditional on $T_{1}$ , we have

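$$\begin{aligned}\operatorname{E}\left[T_{1}-T_{1}^{\prime}\,\big|\,T_{1}\right]&=\sum_{j=1}^{n}\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{k=1}^{n}\operatorname{E}\left[a_{i,j,\sigma(i)}+a_{k,j,\sigma(k)}-a_{i,j,\sigma(k)}-a_{k,j,\sigma(i)}\,\big|\,T_{1}\right]\\&=\frac{2}{n}T_{1}-\frac{2}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{a_{i,j,1}+\ldots+a_{i,j,n}}{n}=\frac{2}{n}\left(T_{1}-\operatorname{E}[T_{1}]\right).\end{aligned}$$

We define $F(T_{1},T_{1}^{\prime})=\frac{n}{2}(T_{1}-T_{1}^{\prime})$ and $f(T_{1}):=\operatorname{E}\left[F(T_{1},T_{1}^{\prime})\ |\ T_{1}\right]=T_{1}-\operatorname{E}[T_{1}]$ . The stochastic bound $v(T_{1})$ then satisfies

$$\begin{aligned}v(T_{1})&\leq\frac{n^{2}}{2}\operatorname{E}\left[\sum_{j=1}^{n}\left(a_{I_{1},j,\sigma(I_{1})}+a_{I_{2},j,\sigma(I_{2})}+a_{I_{1},j,\sigma(I_{2})}+a_{I_{2},j,\sigma(I_{1})}\right)\bigg|\ T_{1}\right]\\&=\frac{n^{2}}{2}\cdot\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{k=1}^{n}\operatorname{E}\left[\sum_{j=1}^{n}\left(a_{i,j,\sigma(i)}+a_{k,j,\sigma(k)}+a_{i,j,\sigma(k)}+a_{k,j,\sigma(i)}\right)\bigg|\ T_{1}\right]\\&=n\cdot T_{1}+n\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{a_{i,j,1}+\ldots+a_{i,j,n}}{n}=n\left(T_{1}+\operatorname{E}[T_{1}]\right)=n\left(f(T_{1})+2\operatorname{E}[T_{1}]\right).\end{aligned}$$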

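Here we used the observation that if $\alpha$ and $\beta$ are non-negative numbers bounded by $D$ , then $(\alpha-\beta)^{2}\leq D\cdot(\alpha+\beta)$ . The hypotheses of Lemma 2.1 then hold with $B=n$ and $C=2n\operatorname{E}[T_{1}]$ , and Lemma 2.1 gives us

$$\mathbb{P}\left(|T_{1}-\operatorname{E}[T_{1}]|\geq t\right)\leq 2\exp\left\{-\frac{t^{2}}{2n\cdot(2\operatorname{E}[T_{1}]+t)}\right\}.$$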

$\square$
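For completeness, the elementary observation used in the proof and the substitution into Lemma 2.1 can be written out as follows; this is a short check of the constants stated above, not an additional result.

```latex
% For 0 <= alpha, beta <= D we have |alpha - beta| <= max(alpha, beta) <= D
% and |alpha - beta| <= alpha + beta, hence
\begin{align*}
(\alpha-\beta)^{2} &= |\alpha-\beta|\cdot|\alpha-\beta| \leq D\cdot(\alpha+\beta).
\end{align*}
% With B = n and C = 2n E[T_1], the denominator in Lemma 2.1 becomes
\begin{align*}
2C+2Bt &= 4n\operatorname{E}[T_{1}]+2nt = 2n\left(2\operatorname{E}[T_{1}]+t\right).
\end{align*}
```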

Remark 4.1.

As $n$ increases, this bound gets weaker. If we consider $t$ increasing faster than $n$ , for example $t=n^{1+\lambda}$ , with $\lambda>0$ , we get

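$$\mathbb{P}\left(|T_{1}-\operatorname{E}[T_{1}]|\geq n^{1+\lambda}\right)\leq 2\exp\left\{-\frac{n^{2+2\lambda}}{2n\cdot(2\operatorname{E}[T_{1}]+n^{1+\lambda})}\right\}\approx 2\exp\left\{-\frac{n^{\lambda}}{2}\right\}$$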

as $n$ grows.
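The following sketch evaluates the bound for $T_{1}$ at $t=n^{1+\lambda}$ and compares it with the approximation $2\exp\{-n^{\lambda}/2\}$ from the remark. The function names and the sample values of `n`, `lam` and `mean_t1` are assumptions chosen for illustration. Since the approximation drops the non-negative term $2\operatorname{E}[T_{1}]$ from the denominator, the exact bound is never smaller than the approximation, and the two are close only when $\operatorname{E}[T_{1}]$ is small compared with $n^{1+\lambda}$.

```python
import math

def t1_bound(t, n, mean_t1):
    """Bound for T_1: 2 exp(-t^2 / (2 n (2 E[T_1] + t)))."""
    return 2.0 * math.exp(-t**2 / (2.0 * n * (2.0 * mean_t1 + t)))

def remark_approx(n, lam):
    """Approximation from the remark: 2 exp(-n^lam / 2)."""
    return 2.0 * math.exp(-(n**lam) / 2.0)

def compare(n, lam, mean_t1):
    """Return (exact bound at t = n^(1 + lam), approximation)."""
    t = n ** (1.0 + lam)
    return t1_bound(t, n, mean_t1), remark_approx(n, lam)

# Hypothetical values: E[T_1] can be anywhere in [0, n^2].
for mean_t1 in (0.0, 50.0, 5000.0):
    exact, approx = compare(n=100, lam=0.5, mean_t1=mean_t1)
    print(f"E[T_1] = {mean_t1}: exact bound = {exact}, approximation = {approx}")
```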

To prove (1.3), we need a more delicate argument. For $k\in\{1,2,3\}$ we define $C_{k}$ to be the set of all ordered tuples $(I_{1},I_{2},I_{3})\in[n]^{3}$ such that $\#\{I_{1},I_{2},I_{3}\}=k$ . As mentioned before, we need $(I_{1},I_{2},I_{3})\in C_{1}\cup C_{3}$ in order for $\tau_{1}(I_{1},I_{2},I_{3})$ and $\tau_{2}(I_{1},I_{2},I_{3})$ to be valid permutations. It is easy to see that in that case we have $\tau_{1}(I_{1},I_{2},I_{3})^{-1}=\tau_{2}(I_{1},I_{2},I_{3})$ . On the contrary, when $(I_{1},I_{2},I_{3})\in C_{2}$ , the permutations $\sigma^{\prime}$ and $\pi^{\prime}$ are not well-defined. With the choice of $(I_{1},I_{2},I_{3})\in C_{3}$ , and $(\sigma^{\prime},\pi^{\prime})$ defined as before, we have

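$$T_{2}^{\prime}=T_{2}-\sum_{j=1}^{3}a_{I_{j},\sigma(I_{j}),\pi(I_{j})}+\left\{\begin{array}{cc}\sum_{j=1}^{3}a_{I_{j},\sigma\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j})}&\text{with prob. }\frac{1}{2}\\[4pt] \sum_{j=1}^{3}a_{I_{j},\sigma\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j})}&\text{with prob. }\frac{1}{2}.\end{array}\right.$$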
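The sizes of the classes $C_{1}$, $C_{2}$ and $C_{3}$ are easy to confirm by brute force. The sketch below (function name hypothetical, zero-based indexes) counts ordered triples by their number of distinct entries; the counts should equal $n$, $3n(n-1)$ and $n(n-1)(n-2)$ respectively.

```python
from itertools import product

def class_sizes(n):
    """Count ordered triples in range(n)^3 by their number of distinct entries."""
    counts = {1: 0, 2: 0, 3: 0}
    for triple in product(range(n), repeat=3):
        counts[len(set(triple))] += 1
    return counts

n = 5
sizes = class_sizes(n)
print(sizes)
print(n, 3 * n * (n - 1), n * (n - 1) * (n - 2))
```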

Proposition 4.2.

$(T_{2},T_{2}^{\prime})$ forms an exchangeable pair.

Proof.

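We proceed along the same lines as in Proposition 4.1. We first write the equation $\mathbb{P}\left(T_{2}=x,T_{2}^{\prime}=x^{\prime}\right)=\operatorname{E}\left[\mathbb{P}\left(T_{2}=x,T_{2}^{\prime}=x^{\prime}\left|\sigma,\pi,\sigma^{\prime},\pi^{\prime}\right.\right)\right]$ and, since $(T_{2},T_{2}^{\prime})$ is a function of $(\sigma,\pi,\sigma^{\prime},\pi^{\prime})$ , we have

$$\mathbb{P}\left(T_{2}=x,T_{2}^{\prime}=x^{\prime}\,\big|\,\sigma,\pi,\sigma^{\prime},\pi^{\prime}\right)=\mathbf{1}\left(\sum_{i}a_{i,\sigma(i),\pi(i)}=x,\ \sum_{i}a_{i,\sigma^{\prime}(i),\pi^{\prime}(i)}=x^{\prime}\right).$$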

We set $\gamma_{3}=\gamma_{1}\circ\tau_{1}(I_{1},I_{2},I_{3}),\ \gamma_{4}=\gamma_{2}\circ\tau_{2}(I_{1},I_{2},I_{3}),\ \gamma_{5}=\gamma_{1}\circ\tau_{2}(I_{1},I_{2},I_{3})$ and $\gamma_{6}=\gamma_{2}\circ\tau_{1}(I_{1},I_{2},I_{3})$ , to get

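$$\begin{aligned}\mathbb{P}\left(T_{2}=x,T_{2}^{\prime}=x^{\prime}\right)&=\operatorname{E}\Bigg[\sum_{(\gamma_{1},\gamma_{2})\in\mathcal{S}_{n}^{2}}\frac{1}{(n!)^{2}}\Bigg\{\frac{1}{2}\cdot\mathbf{1}\left(\sum_{i}a_{i,\gamma_{1}(i),\gamma_{2}(i)}=x,\ \sum_{i}a_{i,\gamma_{1}\circ\tau_{1}(I_{1},I_{2},I_{3})(i),\gamma_{2}\circ\tau_{2}(I_{1},I_{2},I_{3})(i)}=x^{\prime}\right)\\&\qquad+\frac{1}{2}\cdot\mathbf{1}\left(\sum_{i}a_{i,\gamma_{1}(i),\gamma_{2}(i)}=x,\ \sum_{i}a_{i,\gamma_{1}\circ\tau_{2}(I_{1},I_{2},I_{3})(i),\gamma_{2}\circ\tau_{1}(I_{1},I_{2},I_{3})(i)}=x^{\prime}\right)\Bigg\}\Bigg]\\&=\operatorname{E}\Bigg[\sum_{(\gamma_{3},\gamma_{4})\in\mathcal{S}_{n}^{2}}\frac{1}{2(n!)^{2}}\cdot\mathbf{1}\left(\sum_{i}a_{i,\gamma_{3}\circ\tau_{2}(I_{1},I_{2},I_{3})(i),\gamma_{4}\circ\tau_{1}(I_{1},I_{2},I_{3})(i)}=x,\ \sum_{i}a_{i,\gamma_{3}(i),\gamma_{4}(i)}=x^{\prime}\right)\\&\qquad+\sum_{(\gamma_{5},\gamma_{6})\in\mathcal{S}_{n}^{2}}\frac{1}{2(n!)^{2}}\cdot\mathbf{1}\left(\sum_{i}a_{i,\gamma_{5}\circ\tau_{1}(I_{1},I_{2},I_{3})(i),\gamma_{6}\circ\tau_{2}(I_{1},I_{2},I_{3})(i)}=x,\ \sum_{i}a_{i,\gamma_{5}(i),\gamma_{6}(i)}=x^{\prime}\right)\Bigg]\\&=\mathbb{P}\left(T_{2}=x^{\prime},T_{2}^{\prime}=x\right).\end{aligned}$$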

We have used the facts that $\tau_{1}(I_{1},I_{2},I_{3})^{-1}=\tau_{2}(I_{1},I_{2},I_{3})$ and that for each tuple $(I_{1},I_{2},I_{3})$ it is equivalent to sum over all the pairs $(\gamma_{1},\gamma_{2}),(\gamma_{3},\gamma_{4})$ or $(\gamma_{5},\gamma_{6})$ . ∎

The next goal is to find an expression for $\operatorname{E}\left[T_{2}-T_{2}^{\prime}|T_{2}\right]$ that allows us to define $F(T_{2},T_{2}^{\prime})$ and $f(T_{2})$ . First observe that $\operatorname{E}\left[T_{2}^{\prime}\left|T_{2}\right.\right]=\operatorname{E}\left[\operatorname{E}\left[T_{2}^{\prime}\left|\sigma,\pi\right.\right]\left|T_{2}\right.\right]$ , and that one also has

[TABLE]

We deal separately with the terms in the last expression. First of all,

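$$\operatorname{E}\left[\operatorname{E}\left[\sum_{j=1}^{3}a_{I_{j},\sigma(I_{j}),\pi(I_{j})}\,\Big|\,\sigma,\pi\right]\Big|\,T_{2}\right]=\operatorname{E}\left[\sum_{j=1}^{3}\frac{1}{n}\sum_{i=1}^{n}a_{i,\sigma(i),\pi(i)}\,\Big|\,T_{2}\right]=\frac{3}{n}T_{2}.$$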

Now, for any $j\in\{1,2,3\}$ ,

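$$\begin{aligned}\operatorname{E}\left[a_{I_{j},\sigma\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j})}\,\big|\,\sigma,\pi\right]&=\operatorname{E}\left[a_{I_{j},\sigma\circ\tau_{2}(I_{1},I_{2},I_{3})(I_{j}),\pi\circ\tau_{1}(I_{1},I_{2},I_{3})(I_{j})}\,\big|\,\sigma,\pi\right]\\&=\frac{1}{n(n-1)(n-2)}\sum_{(i,j,k)\in C_{3}}a_{i,\sigma(j),\pi(k)},\end{aligned}$$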

which implies that

[TABLE]
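The conditional expectation of a single moved term can be verified by enumeration. The sketch below, with hypothetical helper names and a random test array, fixes $\sigma$ and $\pi$, averages $a_{I_{j},\sigma\circ\tau_{1}(I_{j}),\pi\circ\tau_{2}(I_{j})}$ over all ordered triples of distinct indexes, and compares the result with the normalized sum of $a_{i,\sigma(j),\pi(k)}$ over $C_{3}$. It assumes the cycle $(I_{1},I_{2},I_{3})$ maps $I_{1}\to I_{2}\to I_{3}\to I_{1}$.

```python
import numpy as np
from itertools import permutations

def three_cycle(n, i1, i2, i3):
    """Permutation of range(n) sending i1 -> i2 -> i3 -> i1 and fixing everything else."""
    tau = np.arange(n)
    tau[i1], tau[i2], tau[i3] = i2, i3, i1
    return tau

def moved_term_mean(a, sigma, pi, which):
    """Average of a[I_j, sigma(tau1(I_j)), pi(tau2(I_j))] over distinct triples, j = which."""
    n = len(sigma)
    vals = []
    for trip in permutations(range(n), 3):
        tau1 = three_cycle(n, trip[0], trip[1], trip[2])
        tau2 = three_cycle(n, trip[0], trip[2], trip[1])
        ij = trip[which]
        vals.append(a[ij, sigma[tau1[ij]], pi[tau2[ij]]])
    return float(np.mean(vals))

def c3_average(a, sigma, pi):
    """(1 / (n (n-1) (n-2))) * sum over distinct (i, j, k) of a[i, sigma(j), pi(k)]."""
    n = len(sigma)
    total = sum(a[i, sigma[j], pi[k]] for i, j, k in permutations(range(n), 3))
    return float(total) / (n * (n - 1) * (n - 2))

rng = np.random.default_rng(0)   # arbitrary seed
n = 5
a = rng.random((n, n, n))        # hypothetical array with entries in [0, 1)
sigma, pi = rng.permutation(n), rng.permutation(n)
target = c3_average(a, sigma, pi)
means = [moved_term_mean(a, sigma, pi, j) for j in range(3)]
print(means, target)
```

All three averages coincide with the $C_{3}$ sum, which is why the two branches of the construction contribute equally in expectation.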

We define $Y(\sigma,\pi):=\sum_{(i,j,k)\in C_{2}}a_{i,\sigma(j),\pi(k)}$ , and notice that $0\leq Y(\sigma,\pi)\leq 3n(n-1)$ irrespective of $\sigma$ and $\pi$ . Putting the previous pieces together, we then get

[TABLE]

Using the expressions above, we have

[TABLE]

and it makes sense now to define

[TABLE]

so that

[TABLE]

From (4.2) one has

[TABLE]

and the following important consequences

[TABLE]

and

[TABLE]

As before we want to upper bound the function $v(T_{2})$ , to finally invoke Lemma 2.1.

[TABLE]

The last task is to upper bound these two quantities. With a calculation similar to the one used to obtain (4.1), we get

[TABLE]

and

[TABLE]

Using these estimates in (4.5), together with the lower bound on $f(X)$ obtained in (4.3), we have

[TABLE]

So, making use again of Lemma 2.1, we obtain the concentration inequality

[TABLE]

By (4.3) and (4.4) we obtain a new concentration inequality:

[TABLE]

$\square$

5 Conclusion and Possible Extension

We obtained Bernstein type tail bounds for the statistics $T_{1}$ and $T_{2}$ . These are three dimensional generalizations of the sum of choices $T_{3}$ , which has already been studied in [1] and [3]. A natural extension of this work consists in finding a concentration inequality for a d-dimensional generalization of the statistic $T_{2}$ , of the form

[TABLE]

where $a_{i_{1},...,i_{d}}\in[0,1]$ and $\pi_{j}\in\mathcal{S}_{n}$ for $j=1,...,d-1$ . Here, one should extract $d$ indexes $(I_{1},...,I_{d})$ without replacement and consider the functions $\tau_{j}:[n]^{d}\to\mathcal{S}_{n}$ for $j=1,...,d-1$ , such that $\tau_{j}(I_{1},...,I_{d})$ is one of the $d-1$ cyclic permutations which are not the identity (for clarity, let it be the rotation of $j$ positions forward, where $I_{1}$ gets mapped into $I_{j+1}$ ). When considering $d$ indexes, it is equivalent to rotate each index by $j$ positions in one direction, or by $d-j$ positions in the opposite direction. For this reason one has that $\tau_{j}(I_{1},...,I_{d})^{-1}=\tau_{d-j}(I_{1},...,I_{d})$ , which for $d=3$ reduces to $\tau_{1}^{-1}=\tau_{2}$ as used above. We define

[TABLE]

and then proceed as before, defining $T_{4}^{\prime}$ using the $\pi^{\prime}_{j}$ permutations and showing that $(T_{4},T_{4}^{\prime})$ is an exchangeable pair. To find the tail bound for $T_{4}$ , the calculation is similar to the one above but more cumbersome, and we have not carried it out in the current paper. The main reason for choosing this type of cyclic permutation is that, for every fixed $l\in[d]$ , the indexes $\left(I_{l},\tau_{1}(I_{1},\ldots,I_{d})(I_{l}),\ldots,\tau_{d-1}(I_{1},\ldots,I_{d})(I_{l})\right)$ are distinct. Then, when fixing $\pi_{1},\ldots,\pi_{d-1}$ , the expectation

[TABLE]

contains the sum over all the possible independent directions, while leaving out the cases when some indexes are repeated. This is the fundamental observation which allows us to write $\operatorname{E}\left[T_{4}-T_{4}^{\prime}\left|T_{4}\right.\right]$ in a form convenient for applying Lemma 2.1.
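The two structural facts about the rotations can be checked with a small sketch. It assumes zero-based indexes and that $\tau_{j}$ sends the $l$-th chosen index to the $(l+j)$-th one, cyclically; the function name and the sample indexes are hypothetical. Under that convention the inverse of the $j$-step rotation is the $(d-j)$-step rotation, which for $d=3$ is the relation $\tau_{1}^{-1}=\tau_{2}$ used earlier, and the images of any chosen index under the non-identity rotations are distinct.

```python
import numpy as np

def rotation(n, idx, j):
    """tau_j on range(n): sends idx[l] to idx[(l + j) % d], fixes all other points."""
    d = len(idx)
    tau = np.arange(n)
    for l in range(d):
        tau[idx[l]] = idx[(l + j) % d]
    return tau

n = 9
idx = [7, 2, 5, 0, 3]          # hypothetical distinct indexes (I_1, ..., I_d), d = 5
d = len(idx)

# tau_j composed with tau_{d-j} is the identity.
inverse_ok = all(
    (rotation(n, idx, j)[rotation(n, idx, d - j)] == np.arange(n)).all()
    for j in range(1, d)
)

# For each chosen index, its images under tau_1, ..., tau_{d-1} plus itself are d distinct points.
distinct_ok = all(
    len({idx[l]} | {int(rotation(n, idx, j)[idx[l]]) for j in range(1, d)}) == d
    for l in range(d)
)
print(inverse_ok, distinct_ok)
```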

6 Acknowledgements

The authors are pleased to thank J. Michael Steele for his encouragement and advice.

Bibliography

[1] Chatterjee, S. (2005). Concentration inequalities with exchangeable pairs. Ph.D. thesis, Department of Statistics, Stanford University.
[2] Chatterjee, S. (2007). Stein’s method for concentration inequalities. Probab. Theory Related Fields 138(1), 305–321.
[3] Hoeffding, W. (1951). A combinatorial central limit theorem. Ann. Math. Statist. 22, 558–566.
[4] Maurey, B. (1979). Construction de suites symétriques. C. R. Acad. Sci. Paris Sér. A-B 288(14), A679–A681.
[5] McDiarmid, C. (2002). Concentration for independent permutations. Combin. Probab. Comput. 11(2), 163–178.
[6] Noether, G.E. (1949). On a theorem by Wald and Wolfowitz. Ann. Math. Statist. 20, 455–458.
[7] Talagrand, M. (1995). Concentration of measure and isoperimetric inequalities in product spaces. Inst. Hautes Études Sci. Publ. Math. 81, 73–205.
[8] Wald, A., Wolfowitz, J. (1944). Statistical tests based on permutations of observations. Ann. Math. Statist. 15, 358–372.
