A Bernstein type inequality for sums of selections from three dimensional arrays
Debapratim Banerjee, Matteo Sordello

TL;DR
This paper establishes Bernstein-type concentration inequalities for sums of selected entries from three-dimensional arrays, extending classical results to more complex, higher-dimensional random permutation-based statistics.
Contribution
The paper introduces Bernstein inequalities for three-dimensional array sums involving permutations, generalizing previous two-dimensional results.
Findings
Derived Bernstein inequalities for sums T1 and T2.
Extended concentration results from 2D to 3D array settings.
Provides tools for analyzing permutation-based sums in higher dimensions.
Abstract
We consider the three dimensional array , with , and the two random statistics and , where and are chosen independently from the set of permutations of These can be viewed as natural three dimensional generalizations of the statistic , considered by Hoeffding [3]. Here we give Bernstein type concentration inequalities for and by extending the argument for concentration of by Chatterjee [1].
Dept. of Statistics, University of Pennsylvania
1 Arrays and Concentration Inequalities
Let be a three dimensional array with , and consider the following two statistics
[TABLE]
where and are chosen independently and uniformly from the set of permutations of . Our goal is to obtain Bernstein type tail bounds for the statistics and . Statistics of this type have already been considered in the literature; for example, when the dimension is two, the statistic
[TABLE]
where is drawn uniformly from was studied by Hoeffding [3], who proved that, under certain conditions, it has an asymptotic normal distribution as goes to infinity. In fact, the special case when dates back to the works of Wald and Wolfowitz [8] and Noether [6]. Another example of the statistic is Spearman’s footrule, useful in non-parametric statistics, where Statistics and can be viewed as natural generalizations of statistic in three dimensions. However, in this paper we are concerned with concentration inequalities for and , rather than with their asymptotic distribution. The concentration of was considered by Chatterjee [1] (page 52); specifically, he obtained an elegant tail bound of Bernstein type.
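To make the two-dimensional statistic concrete, the following minimal Python sketch evaluates a sum of selections $\sum_i a_{i,\pi(i)}$ for a uniformly drawn permutation and specializes it to Spearman's footrule by taking $a_{ij} = |i - j|$. The function names and the 0-based indexing are illustrative choices, and the unnormalized footrule matrix is an assumption of the sketch (its entries are not rescaled to a bounded range).

```python
import random

def selection_sum(a, perm):
    """Sum of a[i][perm[i]] over all rows i: one entry is picked per row and per column."""
    return sum(a[i][perm[i]] for i in range(len(perm)))

def footrule_matrix(n):
    """Matrix with entries |i - j| (assumed, unnormalized form of Spearman's footrule)."""
    return [[abs(i - j) for j in range(n)] for i in range(n)]

def random_permutation(n, rng):
    """Uniform random permutation of 0..n-1."""
    perm = list(range(n))
    rng.shuffle(perm)
    return perm

rng = random.Random(0)
n = 6
a = footrule_matrix(n)
perm = random_permutation(n, rng)
print(perm, selection_sum(a, perm))
```

The identity permutation gives a footrule of zero and the reversal gives the largest value, which is a quick sanity check on the indexing.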
Theorem 1.1.
Let and be as above. Then for any
[TABLE]
Chatterjee obtains this bound by the method of exchangeable pairs, and here we extend his method to obtain Bernstein type concentration inequalities for and .
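A Bernstein type bound for the two-dimensional statistic can be sanity-checked by simulation. The sketch below assumes that the bound has the form $2\exp(-t^2/(4\,\mathbb{E}X + 2t))$ for entries in $[0,1]$, which is how Chatterjee's result is usually quoted; the matrix of uniform random entries, the sample sizes and all function names are hypothetical choices made only for illustration.

```python
import math
import random

def selection_sum(a, perm):
    """X = sum over i of a[i][perm[i]]."""
    return sum(a[i][perm[i]] for i in range(len(perm)))

def bernstein_bound(mean, t):
    """ASSUMED form of the two-sided bound: 2 * exp(-t^2 / (4 * mean + 2 * t))."""
    return 2.0 * math.exp(-t * t / (4.0 * mean + 2.0 * t))

def empirical_tail(a, t, trials, rng):
    """Monte Carlo estimate of P(|X - E X| >= t), returned with the exact mean E X."""
    n = len(a)
    mean = sum(map(sum, a)) / n  # E X = (1/n) * sum of all entries
    perm = list(range(n))
    hits = 0
    for _ in range(trials):
        rng.shuffle(perm)
        if abs(selection_sum(a, perm) - mean) >= t:
            hits += 1
    return hits / trials, mean

rng = random.Random(1)
n = 30
a = [[rng.random() for _ in range(n)] for _ in range(n)]  # entries in [0, 1)
for t in (5.0, 10.0, 15.0):
    freq, mean = empirical_tail(a, t, 2000, rng)
    print(t, freq, bernstein_bound(mean, t))
```

A simulation of this kind cannot prove the inequality; it only checks that the empirical tail frequencies stay below the assumed bound on one example.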
Theorem 1.2.
If and are as defined in (1.1) and , then
[TABLE]
Concentration of functions of random permutations has also been studied by Talagrand (Theorem 5.1) [7], Maurey [4] and McDiarmid [5]. However, as mentioned in Chatterjee [1], apart from Talagrand’s Theorem 5.1 none of these results gives Bernstein type concentration inequalities as above.
2 On the method of exchangeable pairs
We first need to recall some notions from the theory of exchangeable pairs as used by Chatterjee [1].
Definition 2.1.
Suppose $X$ is a random variable on a measure space and $X'$ is another random variable defined on the same measure space. The pair $(X, X')$ is called an exchangeable pair if $(X, X') \stackrel{d}{=} (X', X)$.
The method of exchangeable pairs exploits three useful functions:
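- A function $F(X, X')$, measurable and almost surely anti-symmetric, i.e. such that $F(X, X') = -F(X', X)$ almost surely.
- The function $f$ defined by $f(X) = \mathbb{E}[F(X, X') \mid X]$. This is a fundamental quantity in the concentration inequality.
- The function $\Delta$, which serves as a stochastic bound on the size of $f(X)$, and which is defined by
$$\Delta(X) = \tfrac{1}{2}\,\mathbb{E}\bigl[\,|(f(X) - f(X'))\,F(X, X')| \mid X\bigr].$$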
The following lemma from Chatterjee [1] tells us how the concentration of $f(X)$ is governed by a bound on $\Delta(X)$.
Lemma 2.1.
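(Theorem 3.9 in [1]) Suppose $(X, X')$ is an exchangeable pair and $F$, $f$ and $\Delta$ are defined as before, with $\Delta(X) \le B f(X) + C$ almost surely for some known fixed constants $B$ and $C$. Then for any $t \ge 0$
$$P\bigl(|f(X)| \ge t\bigr) \le 2\exp\!\left(-\frac{t^{2}}{2C + 2Bt}\right).$$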
The fundamental idea of the method of exchangeable pairs is to construct , and so that Lemma 2.1 yields concentration for . One example to keep in mind of is , where is a nonrandom constant.
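As an illustration of how the three functions fit together, here is a hedged sketch for the two-dimensional statistic $X = \sum_i a_{i\pi(i)}$ with $a_{ij} \in [0,1]$. It assumes the standard form of Chatterjee's lemma, namely that $\Delta(X) \le B f(X) + C$ implies $P(|f(X)| \ge t) \le 2\exp(-t^2/(2C + 2Bt))$. It also assumes that $X'$ is obtained by composing $\pi$ with the transposition of two independent uniform indices $I, J$, and that $F(X, X') = \tfrac{n}{2}(X - X')$; conditioning is on $\pi$, that is, the lemma is applied to the pair $(\pi, \pi')$.

```latex
\begin{align*}
X - X' &= a_{I\pi(I)} + a_{J\pi(J)} - a_{I\pi(J)} - a_{J\pi(I)}, \\
f(X) &= \mathbb{E}\bigl[\tfrac{n}{2}(X - X') \mid \pi\bigr]
      = \tfrac{n}{2}\Bigl(\tfrac{2}{n}X - \tfrac{2}{n^{2}}\sum_{i,j} a_{ij}\Bigr)
      = X - \mathbb{E}X, \\
\Delta(X) &= \tfrac{n}{4}\,\mathbb{E}\bigl[(X - X')^{2} \mid \pi\bigr]
      \le \tfrac{n}{4}\cdot 2\,\mathbb{E}\bigl[a_{I\pi(I)} + a_{J\pi(J)} + a_{I\pi(J)} + a_{J\pi(I)} \mid \pi\bigr] \\
      &= X + \mathbb{E}X = f(X) + 2\,\mathbb{E}X .
\end{align*}
```

The inequality in the last display uses $(u - v)^2 \le 2(u + v)$ for $u, v \in [0, 2]$. Under these assumptions one may take $B = 1$ and $C = 2\,\mathbb{E}X$, which gives the tail bound $2\exp(-t^2/(4\,\mathbb{E}X + 2t))$.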
Remark 2.1.
Chatterjee ([2], Theorem 1.5) has further proved under the hypotheses of Lemma 2.1 that for the lower tail probabilities one has a genuinely Gaussian bound of the form $P(f(X) \le -t) \le \exp(-t^{2}/(2C))$. One can also show by a further modification of Chatterjee’s method that there is a Gaussian bound for the lower tail probabilities of $T_1$ and $T_2$, but we do not pursue these bounds here.
3 Strategy of the Proofs
To prove the concentration inequalities for and , we use the following general strategy. First we construct the statistics and by applying “small” changes to and , such that two properties hold. We require to form an exchangeable pair and to be somewhat close to for each . We then define the quantity as in the previous section and bound it in terms of . Finally, we derive the concentration inequality for by applying Lemma 2.1.
The construction of is done by choosing two indexes independently and uniformly at random from , and considering the permutation that interchanges the two indexes. We then define the permutation and the statistic , and we prove that . This is a similar procedure to the one in Chatterjee [1].
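The transposition construction can be checked numerically on the two-dimensional statistic $X = \sum_i a_{i\pi(i)}$ treated by Chatterjee. The sketch below is an illustration under assumptions rather than the construction used for the three-dimensional statistic: it takes $F(X, X') = \tfrac{n}{2}(X - X')$, computes the conditional moments exactly by enumerating all $n^2$ pairs of indices, and compares them with the identities $\mathbb{E}[X - X' \mid \pi] = \tfrac{2}{n}(X - \mathbb{E}X)$ and $\Delta \le X + \mathbb{E}X$, which hold for entries in $[0,1]$. All names are hypothetical.

```python
import random

def selection_sum(a, perm):
    """X = sum over i of a[i][perm[i]]."""
    return sum(a[i][perm[i]] for i in range(len(perm)))

def compose_with_transposition(perm, i, j):
    """Return perm o (i j): positions i and j exchange their images."""
    out = list(perm)
    out[i], out[j] = out[j], out[i]
    return out

def conditional_moments(a, perm):
    """Exact E[X - X' | perm] and E[(X - X')^2 | perm] over all n^2 pairs (I, J)."""
    n = len(perm)
    x = selection_sum(a, perm)
    first = second = 0.0
    for i in range(n):
        for j in range(n):
            diff = x - selection_sum(a, compose_with_transposition(perm, i, j))
            first += diff
            second += diff * diff
    return first / n**2, second / n**2

rng = random.Random(2)
n = 8
a = [[rng.random() for _ in range(n)] for _ in range(n)]  # entries in [0, 1)
perm = list(range(n))
rng.shuffle(perm)

x = selection_sum(a, perm)
mean = sum(map(sum, a)) / n             # E X for a uniform permutation
drift, second_moment = conditional_moments(a, perm)
delta = n / 4 * second_moment           # Delta for the ASSUMED F = (n / 2) (X - X')

print(drift, 2 / n * (x - mean))        # the two numbers should agree
print(delta, x + mean)                  # delta should not exceed x + mean
```

Enumerating the pairs instead of sampling them keeps the check exact up to floating point error, at a cost of order $n^3$ operations.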
The construction of is not as simple. The main reason is that is a sum over three independent directions , while, fixing and , is a sum over only a single direction. As a consequence, one can check that it is not possible to get close to by simply moving two indexes. Instead, one needs to move three indexes in a systematic way. We then choose extracted uniformly without replacement from and define the functions such that and . These are the only cyclic permutations which are not the identity. The permutations are defined as follows:
[TABLE]
Note that this is different from the one defined in the construction of , but it will always be clear which one of the two we are considering. Finally we define
[TABLE]
For and to be valid permutations one needs all the indexes to be distinct or all three to be the same. For convenience we only consider the case when are all distinct, since the case when they are all the same does not affect the exchangeability of and and only changes the result slightly, by an amount that is negligible as grows to infinity. It is important to note that one needs and to be valid permutations in order for to have the same distribution as , which is a necessary condition for exchangeability.
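The role of the two non-identity cyclic rotations of three distinct indexes can be illustrated with a short Python sketch. It only demonstrates generic facts about 3-cycles (the second rotation is the first applied twice, it is also its inverse, and composing a permutation with either one yields a permutation); the helper names, the 0-based indexing and the choice to compose on the right are assumptions of the sketch, not the exact definitions used above.

```python
import random

def rotation(n, idx, shift):
    """Map on 0..n-1 sending idx[m] to idx[(m + shift) % len(idx)]; other points are fixed.

    The entries of idx are assumed to be pairwise distinct.
    """
    f = list(range(n))
    for m, src in enumerate(idx):
        f[src] = idx[(m + shift) % len(idx)]
    return f

def compose(p, q):
    """Composition p o q, i.e. i -> p[q[i]]."""
    return [p[q[i]] for i in range(len(q))]

def is_permutation(f):
    """True when f is a bijection of 0..len(f)-1."""
    return sorted(f) == list(range(len(f)))

rng = random.Random(3)
n = 7
i, j, k = rng.sample(range(n), 3)   # drawn without replacement, hence distinct
f1 = rotation(n, (i, j, k), 1)      # i -> j -> k -> i
f2 = rotation(n, (i, j, k), 2)      # i -> k -> j -> i

sigma = list(range(n))
rng.shuffle(sigma)

print(f2 == compose(f1, f1))                # second rotation = first rotation applied twice
print(compose(f1, f2) == list(range(n)))    # the two rotations are inverse to each other
print(is_permutation(compose(sigma, f1)))   # composing keeps sigma a valid permutation
```

Because a uniform permutation composed with a fixed bijection is again uniform, validity of the composed maps is what makes an equality in distribution possible.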
4 Proofs of the results
To prove (1.2), we exchange two pairs of one dimensional rows (i.e. with elements each) in the direction of the matrix , and get
[TABLE]
where and are extracted independently and uniformly at random from .
Proposition 4.1.
$(T_1, T_1')$ is an exchangeable pair.
Proof.
We have the identity and, since is a deterministic function of and , we can write
[TABLE]
where as usual is the indicator function of the event in brackets. One then has
[TABLE]
where we just set . Moreover, for each fixed pair , summing over all possible is equivalent to summing over all , since both sums are made over the whole . ∎
When we take the expectation of with respect to , and , conditional on , we have
[TABLE]
We define and , then the stochastic bound satisfies
[TABLE]
Here we used the observation that if and are non-negative numbers bounded by D, then . The hypotheses of Lemma 2.1 then hold with and , and Lemma 2.1 gives us
[TABLE]
Remark 4.1.
As increases, this bound gets weaker. If we consider increasing faster than , for example , with , we get
[TABLE]
as grows.
Now to prove (1.3), we need a more delicate argument. For we define to be the set of all ordered tuples such that . As already mentioned before, we need in order for and to be valid permutations. It is easy to see that in that case we have . On the contrary, when , the permutations and are not well-defined. With the choice of , and defined as before, we have
[TABLE]
Proposition 4.2.
$(T_2, T_2')$ forms an exchangeable pair.
Proof.
We proceed along the same lines as in the proof of Proposition 4.1. We first write the equation and, since is a function of , we have
[TABLE]
We set and , to get
[TABLE]
We have used the facts that and that for each tuple it is equivalent to sum over all the pairs or . ∎
The next goal is to find an expression for that allows us to define and . First observe that , and that one also has
[TABLE]
We deal separately with the terms in the last expression. First of all,
[TABLE]
Now, for any ,
[TABLE]
which implies that
[TABLE]
We define , and notice that irrespective of and . Putting the previous pieces together, we then get
[TABLE]
Using the expressions above, we have
[TABLE]
and it makes sense now to define
[TABLE]
so that
[TABLE]
From (4.2) one has
[TABLE]
and the following important consequences
[TABLE]
and
[TABLE]
As before we want to upper bound the function , to finally invoke Lemma 2.1.
[TABLE]
So the last task is to upper bound these two quantities. With a calculation similar to the one used to obtain (4.1), we get
[TABLE]
and
[TABLE]
Using these estimates in (4.5), together with the lower bound on obtained in (4.3), we have
[TABLE]
So, making use again of Lemma 2.1, we obtain the concentration inequality
[TABLE]
By (4.3) and (4.4) we obtain a new concentration inequality:
[TABLE]
5 Conclusion and Possible Extension
We obtained Bernstein type tail bounds for the statistics and . These are three dimensional generalizations of the sum of choices , which has already been studied in [1] and [3]. A natural extension of this work consists in finding a concentration inequality for a d-dimensional generalization of the statistic , of the form
[TABLE]
where and for . Here, one should extract indexes without replacement and consider the functions for , such that is one of the cyclic permutations which are not the identity (for clarity, let it be the rotation of positions forward, where gets mapped into ). When considering indexes, it is equivalent to rotate each index by positions in one direction, or by positions in the opposite direction. For this reason one has that . We define
[TABLE]
and then proceed as before, defining using the permutations and showing that is an exchangeable pair. To find the tail bound for , the calculation is similar to the one above but more cumbersome, and we have not carried it out in the current paper. The main reason for choosing this type of cyclic permutation is that, for every fixed , the indexes are distinct. Then, when fixing , the expectation
[TABLE]
contains the sum over all the possible independent directions, while leaving out the cases when some indexes are repeated. This is the fundamental observation which allows us to write in the convenient way to apply Lemma 2.1.
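The two properties of the rotations used in this extension can be checked with a small Python sketch: rotating by $k$ positions forward agrees with rotating by $d - k$ positions backward, and for every fixed position the images under the $d - 1$ non-identity rotations are pairwise distinct. The representation of a rotation as a dictionary and the names `rotation_map` and `images_of` are hypothetical choices made for the illustration.

```python
import random

def rotation_map(idx, k):
    """Rotation by k positions on the distinct indexes idx: idx[m] -> idx[(m + k) % d]."""
    d = len(idx)
    return {idx[m]: idx[(m + k) % d] for m in range(d)}

def images_of(idx, l):
    """Images of idx[l] under the d - 1 non-identity rotations."""
    return [rotation_map(idx, k)[idx[l]] for k in range(1, len(idx))]

rng = random.Random(4)
n, d = 10, 5
idx = tuple(rng.sample(range(n), d))   # d indexes drawn without replacement

# Rotating k steps forward is the same map as rotating d - k steps backward.
forward_equals_backward = all(
    rotation_map(idx, k) == rotation_map(idx, -(d - k)) for k in range(1, d)
)

# The rotations by k and by d - k are inverse to each other.
inverse_pairs = all(
    rotation_map(idx, d - k)[rotation_map(idx, k)[s]] == s
    for k in range(1, d)
    for s in idx
)

# For each fixed position l, the d - 1 images are distinct and never equal idx[l].
distinct_images = all(
    len(set(images_of(idx, l))) == d - 1 and idx[l] not in images_of(idx, l)
    for l in range(d)
)

print(forward_equals_backward, inverse_pairs, distinct_images)
```

The distinctness check is the discrete counterpart of the observation that no index is repeated once one position is fixed.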
6 Acknowledgements
The authors are pleased to thank J. Michael Steele for his encouragement and advice.
References
- [1] Chatterjee, S. (2005). Concentration inequalities with exchangeable pairs. Ph.D. thesis, Department of Statistics, Stanford University.
- [2] Chatterjee, S. (2007). Stein’s method for concentration inequalities. Probab. Theory Related Fields 138(1), 305–321.
- [3] Hoeffding, W. (1951). A combinatorial central limit theorem. Ann. Math. Statist. 22, 558–566.
- [4] Maurey, B. (1979). Construction de suites symétriques. C. R. Acad. Sci. Paris Sér. A-B 288(14), A679–A681.
- [5] McDiarmid, C. (2002). Concentration for independent permutations. Combin. Probab. Comput. 11(2), 163–178.
- [6] Noether, G.E. (1949). On a theorem by Wald and Wolfowitz. Ann. Math. Statist. 20, 455–458.
- [7] Talagrand, M. (1995). Concentration of measure and isoperimetric inequalities in product spaces. Inst. Hautes Études Sci. Publ. Math. 81, 73–205.
- [8] Wald, A., Wolfowitz, J. (1944). Statistical tests based on permutations of observations. Ann. Math. Statist. 15, 358–372.
