Decoupling Maximal Inequalities

Aryeh Kontorovich

arXiv:2302.14150 · math.PR · July 25, 2024


TL;DR

This paper investigates how maximal inequalities behave under different dependence structures among non-negative random variables, showing that pairwise independence and certain negative dependencies still allow for effective bounds.

Contribution

It demonstrates that pairwise independence and specific negative dependence conditions are sufficient for maximal inequalities to hold similarly to the independent case.

Findings

1. Pairwise independence suffices for maximal inequalities to behave like the independent case.

2. Negative dependence conditions can also ensure similar maximal inequality bounds.

3. Violations of negative dependence can be tolerated if properly quantified.

Abstract

A maximal inequality seeks to estimate $\mathop{\mathbb{E}}\max_{i}X_{i}$ in terms of properties of the $X_{i}$. When the latter are independent, the union bound (in its various guises) can yield tight upper bounds. If, however, the $X_{i}$ are strongly dependent, the estimates provided by the union bound will be rather loose. In this note, we show that for non-negative random variables, pairwise independence suffices for the maximal inequality to behave comparably to its independent version. The condition of pairwise independence may be relaxed to a kind of negative dependence, and even the latter admits violations, provided these are properly quantified.

Equations

$$\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}=\mathbb{P}(Z>0),\qquad\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i}=\mathbb{P}(\tilde{Z}>0).$$

$$\mathbb{P}(Z>0)\leq c\,\mathbb{P}(\tilde{Z}>0).$$

$$M\leq\min\{S,1\}\leq c\,(1-\mathrm{e}^{-S}).$$

$$\tilde{M}=1-\prod_{i=1}^{n}(1-p_{i})\geq 1-\mathrm{e}^{-S},$$

$$\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i}\leq\sum_{i=1}^{n}\mathop{\mathbb{E}}\tilde{X}_{i}\leq n\max_{i\in[n]}\mathop{\mathbb{E}}\tilde{X}_{i}=n\max_{i\in[n]}\mathop{\mathbb{E}}X_{i}\leq n\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}.$$

$$\mathbb{P}(Z>0)\geq\frac{1}{2}\mathbb{P}(\tilde{Z}>0).$$

$$\mathbb{P}(Z>0)\geq\frac{(\mathop{\mathbb{E}}Z)^{2}}{\mathop{\mathbb{E}}[Z^{2}]}.$$

$$\mathop{\mathbb{E}}[Z^{2}]=\sum_{i=1}^{n}p_{i}+2\sum_{1\leq i<j\leq n}p_{i}p_{j}=\sum_{i=1}^{n}p_{i}+\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}-\sum_{i=1}^{n}p_{i}^{2}\leq\sum_{i=1}^{n}p_{i}+\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}.$$

$$\frac{(\mathop{\mathbb{E}}Z)^{2}}{\mathop{\mathbb{E}}[Z^{2}]}\geq\frac{\big(\sum_{i=1}^{n}p_{i}\big)^{2}}{\sum_{i=1}^{n}p_{i}+\big(\sum_{i=1}^{n}p_{i}\big)^{2}}.$$

$$\mathbb{P}(\tilde{Z}>0)=1-\prod_{i=1}^{n}(1-p_{i}).$$

$$F(p_{1},\ldots,p_{n}):=2\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}-\Big(\sum_{i=1}^{n}p_{i}+\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}\Big)\Big(1-\prod_{i=1}^{n}(1-p_{i})\Big)\geq 0.$$

$$\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}\leq c\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i}.$$

$$\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}$$

$$\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}\geq\frac{1}{2}\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i}.$$

$$\eta_{ij}=(\mathop{\mathbb{E}}[X_{i}X_{j}]-p_{i}p_{j})_{+}$$

$$\mathop{\mathbb{E}}[Z^{2}]\leq\sum_{i=1}^{n}p_{i}+\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}+\sum_{i,j\in[n]}\eta_{ij}.$$

$$\frac{A}{B}\geq C$$

$$\mathbb{P}(Z>0)\geq\frac{1}{2}\Big(1-\frac{H}{B+H}\Big)\mathbb{P}(\tilde{Z}>0).$$


Taxonomy

Topics: Probability and Risk Models · Risk and Portfolio Optimization

Full text

Decoupling Maximal Inequalities

Aryeh Kontorovich


Abstract

A maximal inequality seeks to estimate $\mathop{\mathbb{E}}\max_{i}X_{i}$ in terms of properties of the $X_{i}$ . When the latter are independent, the union bound (in its various guises) can yield tight upper bounds. If, however, the $X_{i}$ are strongly dependent, the estimates provided by the union bound will be rather loose. In this note, we show that for non-negative random variables, pairwise independence suffices for the maximal inequality to behave comparably to its independent version. The condition of pairwise independence may be relaxed to a kind of negative dependence, and even the latter admits violations — provided these are properly quantified.

1 Motivation

Maximal inequalities are at the heart of empirical process theory (van Handel, 2014). The case of Gaussian processes is well-understood via the celebrated generic chaining technique (Talagrand, 2016). There, a key role in the lower bounds is played by Slepian’s inequality, which allows one to approximate a Gaussian process by an appropriate uncorrelated one. The absence of a generic analog of Slepian’s inequality (say, for the kind of Binomial process considered in Cohen and Kontorovich (2022)) can be a major obstruction in obtaining tight lower bounds. Indeed, as Proposition 3 below shows, for nonnegative $X_{i}$, any upper bound on $\mathop{\mathbb{E}}\max_{i}\tilde{X}_{i}$, where $\tilde{X}_{i}$ is the “independent version” of $X_{i}$, automatically yields an upper bound on $\mathop{\mathbb{E}}\max_{i}X_{i}$. The reverse direction, of course, fails without additional structural assumptions. We discover that pairwise independence suffices for the reverse direction, and that this condition can be relaxed further.

2 The Bernoulli case

Let $X_{1},X_{2},\ldots,X_{n}$ and $\tilde{X}_{1},\tilde{X}_{2},\ldots,\tilde{X}_{n}$ be two collections of Bernoulli random variables, where the $\tilde{X}_{i}$ are mutually independent (and independent of the $X_{i}$), with $X_{i},\tilde{X}_{i}\sim\mathrm{Bernoulli}(p_{i})$. Letting $Z=\sum_{i=1}^{n}X_{i}$ and $\tilde{Z}=\sum_{i=1}^{n}\tilde{X}_{i}$, we have

$$\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}=\mathbb{P}(Z>0),\qquad\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i}=\mathbb{P}(\tilde{Z}>0).$$

Decoupling from above.

An elegant result of Pinelis (2022) (answering our question) shows that $\mathbb{P}(Z>0)\lesssim\mathbb{P}(\tilde{Z}>0)$; his proof is provided for completeness:

Proposition 1 (Pinelis).

For $c=\mathrm{e}/(\mathrm{e}-1)$ and $X_{i},\tilde{X}_{i},Z,\tilde{Z},p_{i}$ as above, we have

$$\mathbb{P}(Z>0)\leq c\,\mathbb{P}(\tilde{Z}>0).$$

Proof.

Put $M=\mathbb{P}(Z>0)$ , $\tilde{M}=\mathbb{P}(\tilde{Z}>0)$ and $S=\sum_{i=1}^{n}p_{i}$ , and observe that

$$M\leq\min\{S,1\}\leq c\,(1-\mathrm{e}^{-S}).$$

On the other hand,

$$\tilde{M}=1-\prod_{i=1}^{n}(1-p_{i})\geq 1-\mathrm{e}^{-S},$$

whence $M\leq c\tilde{M}$ . ∎

Further, we note that Pinelis’s constant $c=\mathrm{e}/(\mathrm{e}-1)$ is optimal. Indeed, consider the case where $p_{i}=1/n$ , $i\in[n]$ , and $\mathbb{P}(Z=1)=1$ . This makes $\mathbb{P}(\tilde{Z}>0)=1-(1-1/n)^{n}\to 1-1/\mathrm{e}$ as $n\to\infty$ .
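The optimality claim can be checked numerically. The following is a minimal sketch, not part of the original note; the helper names `prob_any_independent`, `extremal_ratio` and `key_inequality_holds` are illustrative assumptions. It evaluates the ratio $\mathbb{P}(Z>0)/\mathbb{P}(\tilde{Z}>0)$ for the example $p_{i}=1/n$, $\mathbb{P}(Z=1)=1$, and spot-checks the inequality $\min\{S,1\}\leq c(1-\mathrm{e}^{-S})$ used in the proof.

```python
import math

C_PINELIS = math.e / (math.e - 1.0)  # c = e/(e-1) from Proposition 1


def prob_any_independent(p):
    """P(Z~ > 0) = 1 - prod_i (1 - p_i) for independent Bernoulli(p_i)."""
    return 1.0 - math.prod(1.0 - pi for pi in p)


def extremal_ratio(n):
    """Ratio P(Z > 0) / P(Z~ > 0) when p_i = 1/n and P(Z = 1) = 1."""
    # Exactly one X_i equals 1 almost surely, so P(Z > 0) = 1.
    return 1.0 / prob_any_independent([1.0 / n] * n)


def key_inequality_holds(s, tol=1e-12):
    """Check min{s, 1} <= c * (1 - exp(-s)), up to floating-point rounding."""
    return min(s, 1.0) <= C_PINELIS * (1.0 - math.exp(-s)) + tol


if __name__ == "__main__":
    for n in (2, 10, 100, 10_000):
        print(n, extremal_ratio(n), C_PINELIS)
```

The ratio increases with $n$ and approaches $c$ from below, which is consistent with the constant being attained only in the limit.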

Despite its elegance, Proposition 1 will likely have limited applications, since in practice, the techniques for upper-bounding $\mathop{\mathbb{E}}\max_{i}X_{i}$ rely on the union bound and are insensitive to the dependence structure of $X_{i}$ — in which case the technique employed in upper-bounding $\mathop{\mathbb{E}}\max_{i}X_{i}$ automatically upper-bounds $\mathop{\mathbb{E}}\max_{i}\tilde{X}_{i}$ as well.

Decoupling from below.

A more interesting and useful direction would be to obtain an estimate of the form $\mathbb{P}(Z>0)\gtrsim\mathbb{P}(\tilde{Z}>0)$ . Clearly, no such dimension-free estimate can hold without further assumptions on the $X_{i}$ . Indeed, for a small $\varepsilon>0$ , let $\mathbb{P}(X_{1}=X_{2}=\ldots=X_{n}=1)=\varepsilon$ and $\mathbb{P}(X_{1}=X_{2}=\ldots=X_{n}=0)=1-\varepsilon$ . In this case, $\mathbb{P}(Z>0)=\varepsilon$ . On the other hand, $\mathbb{P}(\tilde{Z}>0)=1-(1-\varepsilon)^{n}=n\varepsilon+O(\varepsilon^{2})$ , and so $\mathbb{P}(\tilde{Z}>0)/\mathbb{P}(Z>0)\to n$ as $\varepsilon\to 0$ . Nor can the ratio exceed $n$ , since

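$$\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i}\leq\sum_{i=1}^{n}\mathop{\mathbb{E}}\tilde{X}_{i}\leq n\max_{i\in[n]}\mathop{\mathbb{E}}\tilde{X}_{i}=n\max_{i\in[n]}\mathop{\mathbb{E}}X_{i}\leq n\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}.$$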
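The fully coupled example above is easy to evaluate directly. The sketch below is illustrative only (the function name `coupled_ratio` is an assumption); it computes $\mathbb{P}(\tilde{Z}>0)/\mathbb{P}(Z>0)$ when all $X_{i}$ are equal and each has mean $\varepsilon$.

```python
def coupled_ratio(n, eps):
    """P(Z~ > 0) / P(Z > 0) when X_1 = ... = X_n with P(X_1 = 1) = eps."""
    p_z = eps                              # all X_i fire together
    p_z_tilde = 1.0 - (1.0 - eps) ** n     # independent Bernoulli(eps) copies
    return p_z_tilde / p_z


if __name__ == "__main__":
    for eps in (1e-1, 1e-3, 1e-6):
        print(eps, coupled_ratio(5, eps))
```

As $\varepsilon$ shrinks the ratio approaches $n$ from below, in line with the dimension-dependent bound displayed above.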

Let us recall the notion of pairwise independence. For the Bernoulli case, it means that for each $i\neq j\in[n]$ , we have $\mathop{\mathbb{E}}[X_{i}X_{j}]=\mathop{\mathbb{E}}[X_{i}]\mathop{\mathbb{E}}[X_{j}]$ . The main result of this note is that pairwise independence suffices for $\mathbb{P}(Z>0)\gtrsim\mathbb{P}(\tilde{Z}>0)$ .

Proposition 2.

Let $X_{i},\tilde{X}_{i},Z,\tilde{Z},p_{i}$ be as above, and assume additionally that the $X_{i}$ are pairwise independent. Then

$$\mathbb{P}(Z>0)\geq\frac{1}{2}\mathbb{P}(\tilde{Z}>0).$$

Proof.

By the Paley-Zygmund inequality [Footnote 1: We thank Ron Peled for the suggestion of applying Paley-Zygmund to $Z$.],

$$\mathbb{P}(Z>0)\geq\frac{(\mathop{\mathbb{E}}Z)^{2}}{\mathop{\mathbb{E}}[Z^{2}]}.$$

Now $\mathop{\mathbb{E}}Z=\sum_{i=1}^{n}p_{i}$ and, by pairwise independence,

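$$\mathop{\mathbb{E}}[Z^{2}]=\sum_{i=1}^{n}p_{i}+2\sum_{1\leq i<j\leq n}p_{i}p_{j}=\sum_{i=1}^{n}p_{i}+\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}-\sum_{i=1}^{n}p_{i}^{2}\leq\sum_{i=1}^{n}p_{i}+\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}.\tag{1}$$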

Hence,

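$$\frac{(\mathop{\mathbb{E}}Z)^{2}}{\mathop{\mathbb{E}}[Z^{2}]}\geq\frac{\big(\sum_{i=1}^{n}p_{i}\big)^{2}}{\sum_{i=1}^{n}p_{i}+\big(\sum_{i=1}^{n}p_{i}\big)^{2}}.$$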

On the other hand, $\mathbb{P}(\tilde{Z}>0)$ is readily computed:

$$\mathbb{P}(\tilde{Z}>0)=1-\prod_{i=1}^{n}(1-p_{i}).$$

Therefore, to prove the claim, it suffices to show that

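$$F(p_{1},\ldots,p_{n}):=2\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}-\Big(\sum_{i=1}^{n}p_{i}+\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}\Big)\Big(1-\prod_{i=1}^{n}(1-p_{i})\Big)\geq 0.$$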

To this end [Footnote 2: This elegant proof that $F\geq 0$ is due to D. Berend, who also corrected a mistake in an earlier, clunkier proof of ours.], we factorize $F=SG$, where $G=S+P+SP-1$, $S=\sum_{i}p_{i}$ and $P=\prod_{i}(1-p_{i})$. Thus, $F\geq 0\iff G\geq 0$ and in particular, it suffices to verify the latter. Now if $S\geq 1$ then obviously $G\geq 0$ and we are done. Otherwise, since $P\geq 1-S$ trivially holds, we have $G\geq S(1-S)$. In this case, $S<1\implies G\geq 0$. ∎
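A small exact computation can make Proposition 2 concrete. The sketch below is not from the note: it uses the standard XOR construction ($X_{1},X_{2}$ fair coins, $X_{3}=X_{1}\oplus X_{2}$) as an assumed example of a pairwise independent but not mutually independent triple, and all function names are illustrative. It also checks the factorization $F=SG$ in exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product
from math import prod

# Assumed example: X1, X2 fair coins, X3 = X1 XOR X2 (pairwise independent).
JOINT = {(x1, x2, x1 ^ x2): Fraction(1, 4)
         for x1, x2 in product((0, 1), repeat=2)}


def mean(joint, i):
    """p_i = P(X_i = 1)."""
    return sum(pr for x, pr in joint.items() if x[i] == 1)


def pair_moment(joint, i, j):
    """E[X_i X_j] = P(X_i = 1, X_j = 1)."""
    return sum(pr for x, pr in joint.items() if x[i] == 1 and x[j] == 1)


def is_pairwise_independent(joint, n):
    return all(pair_moment(joint, i, j) == mean(joint, i) * mean(joint, j)
               for i in range(n) for j in range(i + 1, n))


def prob_any(joint):
    """P(Z > 0) under the joint law."""
    return sum(pr for x, pr in joint.items() if sum(x) > 0)


def prob_any_independent(p):
    """P(Z~ > 0) = 1 - prod_i (1 - p_i)."""
    return 1 - prod(1 - pi for pi in p)


def F(p):
    S, P = sum(p), prod(1 - pi for pi in p)
    return 2 * S**2 - (S + S**2) * (1 - P)


def G(p):
    S, P = sum(p), prod(1 - pi for pi in p)
    return S + P + S * P - 1


if __name__ == "__main__":
    p = [mean(JOINT, i) for i in range(3)]
    print(is_pairwise_independent(JOINT, 3))
    print(prob_any(JOINT), prob_any_independent(p))
    print(F(p) == sum(p) * G(p), G(p) >= 0)
```

For this triple the ratio $\mathbb{P}(Z>0)/\mathbb{P}(\tilde{Z}>0)$ is well above the guaranteed $\frac{1}{2}$, so the example illustrates the statement rather than probing its tightness.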

We conjecture that the constant $\frac{1}{2}$ in Proposition 2 is not optimal. For a fixed $n$, define the joint pairwise independent distribution on $(X_{1},\ldots,X_{n})$ (conjecturally, an extremal one for minimizing $\mathbb{P}(Z>0)/\mathbb{P}(\tilde{Z}>0)$) as follows: $p_{i}=1/(n-1)$, $i\in[n]$, $\mathbb{P}(Z=0)=\frac{1}{2}-\frac{1}{2(n-1)}$, and $\mathbb{P}(Z=2)=1-\mathbb{P}(Z=0)$. This makes $\mathbb{P}(Z>0)\to\frac{1}{2}$ and $\mathbb{P}(\tilde{Z}>0)=1-(1-1/(n-1))^{n}\to 1-1/\mathrm{e}$ as $n\to\infty$. If our conjecture is correct, the optimal constant for the lower bound is $c^{\prime}=\frac{\mathrm{e}}{2(\mathrm{e}-1)}$, or exactly half of Pinelis’s constant. [Footnote 3: We thank Daniel Berend, Alexander Goldenshluger, and Yuval Peres for raising the question of the constants. AG (and also Omer Ben-Porat) pointed out a possible connection to prophet inequalities, and in particular the Bernoulli selection lemma in Correa et al. (2017).]
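The limiting value for this candidate distribution can be evaluated numerically. This is a rough sketch under the parameters stated above; the name `conjectured_ratio` is an assumption, and the code computes the ratio $\mathbb{P}(Z>0)/\mathbb{P}(\tilde{Z}>0)$ only, without constructing the joint law itself.

```python
import math

# Conjectured limiting constant e / (2(e - 1)).
LIMIT = math.e / (2.0 * (math.e - 1.0))


def conjectured_ratio(n):
    """P(Z > 0) / P(Z~ > 0) for p_i = 1/(n-1), P(Z=0) = 1/2 - 1/(2(n-1))."""
    p_z = 0.5 + 0.5 / (n - 1)                      # P(Z > 0) = P(Z = 2)
    p_z_tilde = 1.0 - (1.0 - 1.0 / (n - 1)) ** n   # independent version
    return p_z / p_z_tilde


if __name__ == "__main__":
    for n in (3, 10, 100, 10_000):
        print(n, conjectured_ratio(n), LIMIT)
```

For every $n\geq 3$ the ratio stays above $\frac{1}{2}$, as Proposition 2 requires, and it settles near $0.79$ for large $n$.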

Relaxing pairwise independence.

An inspection of the proof shows that we do not actually need $\mathop{\mathbb{E}}[X_{i}X_{j}]=p_{i}p_{j}$ , but rather only $\mathop{\mathbb{E}}[X_{i}X_{j}]\leq p_{i}p_{j}$ . This condition is called negative (pairwise) covariance (Dubhashi and Ranjan, 1998).

3 Positive real case

In this section, we assume that $X_{1},\ldots,X_{n}$ are nonnegative integrable random variables and the $\tilde{X}_{1},\ldots,\tilde{X}_{n}$ are their independent copies: each $\tilde{X}_{i}$ is distributed identically to $X_{i}$ and the $\tilde{X}_{i}$ are mutually independent.

As a warmup, let us see how Proposition 1 yields $\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}\lesssim\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i}$ :

Proposition 3.

Let $X_{1},\ldots,X_{n}$ be nonnegative and integrable with independent copies $\tilde{X}_{i}$ as above. For $c=\mathrm{e}/(\mathrm{e}-1)$ , we have

$$\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}\leq c\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i}.$$

Proof.

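For $t>0$ and $i\in[n]$, put $Y_{i}(t)=\boldsymbol{1}[X_{i}>t]$, $\tilde{Y}_{i}(t)=\boldsymbol{1}[\tilde{X}_{i}>t]$ and $Z(t)=\sum_{i=1}^{n}Y_{i}(t)$, $\tilde{Z}(t)=\sum_{i=1}^{n}\tilde{Y}_{i}(t)$. Then

$$\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}=\int_{0}^{\infty}\mathbb{P}\Big(\max_{i\in[n]}X_{i}>t\Big)\,\mathrm{d}t=\int_{0}^{\infty}\mathbb{P}(Z(t)>0)\,\mathrm{d}t\leq c\int_{0}^{\infty}\mathbb{P}(\tilde{Z}(t)>0)\,\mathrm{d}t=c\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i},$$

where the inequality applies Proposition 1 to the Bernoulli variables $Y_{i}(t)$ for each fixed $t$.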

∎

For pairwise independent $X_{i}$ , we have a reverse inequality:

Proposition 4.

Let $X_{1},\ldots,X_{n}$ be nonnegative and integrable with independent copies $\tilde{X}_{i}$ as above. If additionally the $X_{i}$ are pairwise independent, then

$$\mathop{\mathbb{E}}\max_{i\in[n]}X_{i}\geq\frac{1}{2}\mathop{\mathbb{E}}\max_{i\in[n]}\tilde{X}_{i}.$$

Proof.

The proof is entirely analogous to that of Proposition 3, except that Proposition 2 is invoked in the inequality step. ∎
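Propositions 3 and 4 can be illustrated on a small finite example by exact enumeration. The sketch below is an assumption-laden illustration rather than anything from the note: it scales the XOR triple ($X_{3}=X_{1}\oplus X_{2}$ with fair coins) by the weights $1,2,3$ to get pairwise independent nonnegative variables, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod

# Assumed example: (B1, B2, B1 XOR B2) with fair coins, scaled by (1, 2, 3).
JOINT = {(1 * b1, 2 * b2, 3 * (b1 ^ b2)): Fraction(1, 4)
         for b1, b2 in product((0, 1), repeat=2)}


def expected_max(joint):
    """E max_i X_i for a finite joint law {outcome tuple: probability}."""
    return sum(pr * max(x) for x, pr in joint.items())


def independent_version(joint):
    """Joint law of independent copies with the same marginals."""
    n = len(next(iter(joint)))
    marginals = []
    for i in range(n):
        m = {}
        for x, pr in joint.items():
            m[x[i]] = m.get(x[i], 0) + pr
        marginals.append(m)
    out = {}
    for combo in product(*(m.items() for m in marginals)):
        values = tuple(v for v, _ in combo)
        out[values] = out.get(values, 0) + prod(pr for _, pr in combo)
    return out


if __name__ == "__main__":
    e_max = expected_max(JOINT)
    e_max_tilde = expected_max(independent_version(JOINT))
    print(e_max, e_max_tilde)
```

Here the two expected maxima are within a small factor of each other, consistent with the two-sided comparison given by Propositions 3 and 4.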

Relaxing pairwise independence.

As before, the full strength of pairwise independence of the $X_{i}$ is not needed. The condition $\mathbb{P}(X_{i}>t,X_{j}>t)\leq\mathbb{P}(X_{i}>t)\mathbb{P}(X_{j}>t)$ for all $i\neq j\in[n]$ and $t>0$ would suffice; it is weaker than pairwise negative upper orthant dependence (Joag-Dev and Proschan, 1983). [Footnote 4: Thanks to Murat Kocaoglu for this reference.]

4 Back to Bernoulli: beyond negative covariance

What if the Bernoulli $X_{i}$ do not satisfy the negative covariance condition $\mathop{\mathbb{E}}[X_{i}X_{j}]\leq p_{i}p_{j}$? Proposition 2 is not directly applicable, but not all is lost. For $i\neq j\in[n]$, define $\eta_{ij}$ by

$$\eta_{ij}=(\mathop{\mathbb{E}}[X_{i}X_{j}]-p_{i}p_{j})_{+}$$

and put $\eta_{ii}:=0$ . Thus, $\mathop{\mathbb{E}}[X_{i}X_{j}]\leq p_{i}p_{j}+\eta_{ij}$ , and, repeating the calculation in Eq. (1),

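$$\mathop{\mathbb{E}}[Z^{2}]\leq\sum_{i=1}^{n}p_{i}+\Big(\sum_{i=1}^{n}p_{i}\Big)^{2}+\sum_{i,j\in[n]}\eta_{ij}.$$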

Let us put $S=\sum_{i=1}^{n}p_{i}$ , $A=S^{2}$ , $B=S+S^{2}$ , $C=\frac{1}{2}\mathbb{P}(\tilde{Z}>0)$ , and $H=\sum_{i,j\in[n]}\eta_{ij}$ . Now, for $A,B,C,H\geq 0$ , we have

$$\frac{A}{B}\geq C\implies\frac{A}{B+H}\geq\Big(1-\frac{H}{B+H}\Big)C,$$

and so we obtain a generalization of Proposition 2:

Proposition 5.

Let $X_{i},\tilde{X}_{i},Z,\tilde{Z},p_{i},B,H$ be as above. Then

$$\mathbb{P}(Z>0)\geq\frac{1}{2}\Big(1-\frac{H}{B+H}\Big)\mathbb{P}(\tilde{Z}>0).$$
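For the reader's convenience, the chain of estimates behind Proposition 5 can be spelled out as follows. This is our reading of the argument, assembled from the Paley-Zygmund step and the bound $S^{2}/(S+S^{2})\geq\frac{1}{2}\mathbb{P}(\tilde{Z}>0)$ established in the proof of Proposition 2 (that is, $A/B\geq C$), and it assumes $S>0$:

```latex
\begin{align*}
\mathbb{P}(Z>0)
  &\geq \frac{(\mathbb{E} Z)^{2}}{\mathbb{E}[Z^{2}]}
   \geq \frac{S^{2}}{S+S^{2}+H}
   = \frac{A}{B+H} \\
  &= \frac{A}{B}\cdot\frac{B}{B+H}
   \geq C\Big(1-\frac{H}{B+H}\Big)
   = \frac{1}{2}\Big(1-\frac{H}{B+H}\Big)\mathbb{P}(\tilde{Z}>0).
\end{align*}
```

Only the second inequality in the first line uses the $\eta_{ij}$; the rest is unchanged from the pairwise independent case.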

When $H\lesssim B$ , Proposition 5 yields useful estimates.
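To see what Proposition 5 gives in a concrete case, the quantities $H$ and $B$ can be computed exactly for a small joint law. The sketch below uses an assumed two-variable example with positively correlated Bernoulli variables; the distribution `JOINT` and the function name `proposition5_terms` are illustrative choices, not taken from the note.

```python
from fractions import Fraction
from math import prod

# Assumed example: two positively correlated Bernoulli(3/8) variables.
JOINT = {
    (0, 0): Fraction(1, 2),
    (1, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8),
    (0, 1): Fraction(1, 8),
}


def proposition5_terms(joint):
    """Return H, B, P(Z > 0) and the lower bound of Proposition 5."""
    n = len(next(iter(joint)))
    p = [sum(pr for x, pr in joint.items() if x[i] == 1) for i in range(n)]
    H = Fraction(0)
    for i in range(n):
        for j in range(n):
            if i != j:  # eta_ii := 0
                moment = sum(pr for x, pr in joint.items()
                             if x[i] == 1 and x[j] == 1)
                H += max(moment - p[i] * p[j], 0)  # eta_ij = (.)_+
    S = sum(p)
    B = S + S**2
    p_z = sum(pr for x, pr in joint.items() if sum(x) > 0)
    p_z_tilde = 1 - prod(1 - pi for pi in p)
    bound = Fraction(1, 2) * (1 - H / (B + H)) * p_z_tilde
    return {"H": H, "B": B, "p_z": p_z, "bound": bound}


if __name__ == "__main__":
    print(proposition5_terms(JOINT))
```

In this example $H$ is small relative to $B$, so the correction factor $1-H/(B+H)$ stays close to $1$ and the bound loses little compared with Proposition 2.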

Bibliography

1. Cohen and Kontorovich [2022] Doron Cohen and Aryeh Kontorovich. Local Glivenko-Cantelli, 2022. URL https://arxiv.org/abs/2209.04054.
2. Correa et al. [2017] José Correa, Patricio Foncea, Ruben Hoeksma, Tim Oosterwijk, and Tjark Vredeveld. Posted price mechanisms for a random stream of customers. In Proceedings of the 2017 ACM Conference on Economics and Computation, EC ’17, pages 169–186, New York, NY, USA, 2017. Association for Computing Machinery. ISBN 9781450345279. doi: 10.1145/3033274.3085137. URL https://doi.org/10.1145/3033274.3085137.
3. Dubhashi and Ranjan [1998] Devdatt Dubhashi and Desh Ranjan. Balls and bins: a study in negative dependence. Random Struct. Algorithms, 13(2):99–124, September 1998. ISSN 1042-9832. doi: 10.1002/(SICI)1098-2418(199809)13:2<99::AID-RSA1>3.0.CO;2-M.
4. Joag-Dev and Proschan [1983] Kumar Joag-Dev and Frank Proschan. Negative Association of Random Variables with Applications. The Annals of Statistics, 11(1):286–295, 1983. doi: 10.1214/aos/1176346079. URL https://doi.org/10.1214/aos/1176346079.
5. Pinelis [2022] Iosif Pinelis. Max decoupling inequality. Math Overflow, 2022. URL https://mathoverflow.net/q/422636.
6. Talagrand [2016] Michel Talagrand. Upper and lower bounds for stochastic processes. Springer, 2016.
7. van Handel [2014] Ramon van Handel. Probability in high dimension. Technical report, Princeton University, NJ, 2014.
