A New Proof of Nonsignalling Multiprover Parallel Repetition Theorem

Himanshu Tyagi; Shun Watanabe

arXiv:1901.11105·cs.IT·February 1, 2019


TL;DR

This paper provides a new information-theoretic proof of the nonsignalling multiprover parallel repetition theorem, enhancing understanding of hardness of approximation in computational complexity.

Contribution

It introduces a novel proof technique based on a recent method for strong converse results, replacing de Finetti decompositions with a change of measure approach.

Findings

01. New proof simplifies understanding of the theorem

02. Extends techniques from multiuser information theory

03. Potentially impacts hardness of approximation results

Abstract

We present an information theoretic proof of the nonsignalling multiprover parallel repetition theorem, a recent extension of its two-prover variant that underlies many hardness of approximation results. The original proofs used de Finetti type decomposition for strategies. We present a new proof that is based on a technique we introduced recently for proving strong converse results in multiuser information theory and entails a change of measure after replacing hard information constraints with soft ones.

Equations

$$\rho(G):=\max\big\{\mathbb{E}\left[\omega(X_{1},X_{2},f_{1}(X_{1}),f_{2}(X_{2}))\right]:f_{1}\in{\mathcal{F}}({\mathcal{U}}_{1}|{\mathcal{X}}_{1}),f_{2}\in{\mathcal{F}}({\mathcal{U}}_{2}|{\mathcal{X}}_{2})\big\}.$$

$$\omega^{\wedge n}(x_{1}^{n},x_{2}^{n},u_{1}^{n},u_{2}^{n}):=\bigwedge_{j=1}^{n}\omega(x_{1j},x_{2j},u_{1j},u_{2j}),$$

$$\rho(G^{\wedge n}):=\max\big\{\mathbb{E}\left[\omega^{\wedge n}(X_{1}^{n},X_{2}^{n},f_{1}(X_{1}^{n}),f_{2}(X_{2}^{n}))\right]:f_{1}\in{\mathcal{F}}({\mathcal{U}}_{1}^{n}|{\mathcal{X}}_{1}^{n}),f_{2}\in{\mathcal{F}}({\mathcal{U}}_{2}^{n}|{\mathcal{X}}_{2}^{n})\big\}.$$

$$\rho(G^{\wedge n})\leq C(\rho(G))^{\frac{n}{\log|{\mathcal{U}}_{1}||{\mathcal{U}}_{2}|}}.$$

$$\mathrm{P}_{U_{1}U_{2}|X_{1}X_{2}}(u_{1},u_{2}|x_{1},x_{2})=\sum_{f_{1}\in{\mathcal{F}}({\mathcal{U}}_{1}|{\mathcal{X}}_{1}),\,f_{2}\in{\mathcal{F}}({\mathcal{U}}_{2}|{\mathcal{X}}_{2})}\mu(f_{1},f_{2})\,\delta_{f_{1},f_{2}}(u_{1},u_{2}|x_{1},x_{2})$$

$$\rho(G)=\max_{\mathrm{P}_{U_{1}U_{2}|X_{1}X_{2}}\in{\mathcal{P}}_{\mathtt{HVT}}}\mathbb{E}\left[\omega(X_{1},X_{2},U_{1},U_{2})\right].$$

$$\rho(G)=\max\big\{\mathbb{E}\left[\omega(X_{1},X_{2},U_{1},U_{2})\right]:{\mathrm{P}_{U_{1}U_{2}|X_{1}X_{2}}}\mbox{ s.t. }U_{1}-\!\!\!\!\circ\!\!\!\!-X_{1}-\!\!\!\!\circ\!\!\!\!-X_{2}-\!\!\!\!\circ\!\!\!\!-U_{2}\big\},$$

$$\begin{aligned}\mathrm{P}_{U_{1}|X_{1}X_{2}}(u_{1}|x_{1},x_{2})&=\mathrm{P}_{U_{1}|X_{1}X_{2}}(u_{1}|x_{1},x_{2}^{\prime}),\\ \mathrm{P}_{U_{2}|X_{1}X_{2}}(u_{2}|x_{1},x_{2})&=\mathrm{P}_{U_{2}|X_{1}X_{2}}(u_{2}|x_{1}^{\prime},x_{2})\end{aligned}$$

$${\mathcal{P}}_{\mathtt{HVT}}\subset{\mathcal{P}}_{\mathtt{NS}}.$$

$$\rho_{\mathtt{NS}}(G):=\max_{\mathrm{P}_{U_{1}U_{2}|X_{1}X_{2}}\in{\mathcal{P}}_{\mathtt{NS}}}\mathbb{E}\left[\omega(X_{1},X_{2},U_{1},U_{2})\right].$$

$$\mathrm{P}^{\mathtt{PR}}_{U_{1}U_{2}|X_{1}X_{2}}(u_{1},u_{2}|x_{1},x_{2})=\frac{1}{2}\mathbf{1}[u_{1}\oplus u_{2}=x_{1}\wedge x_{2}].$$

$$\rho_{\mathtt{NS}}(G^{\wedge n})\leq C(\rho_{\mathtt{NS}}(G))^{n}.$$

$$\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}(u_{\mathcal{M}}|x_{\mathcal{M}})=\sum_{f_{i}\in{\mathcal{F}}({\mathcal{U}}_{i}|{\mathcal{X}}_{i}),\,i\in{\mathcal{M}}}\mu(f_{\mathcal{M}})\,\delta_{f_{\mathcal{M}}}(u_{\mathcal{M}}|x_{\mathcal{M}}),$$

$$\rho(G)=\max_{\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}\in{\mathcal{P}}_{\mathtt{HVT}}}\mathbb{E}\left[\omega(X_{\mathcal{M}},U_{\mathcal{M}})\right].$$

$$\mathrm{P}_{U_{\mathcal{A}}|X_{\mathcal{M}}}(u_{\mathcal{A}}|x_{\mathcal{A}},x_{{\mathcal{A}}^{c}})=\mathrm{P}_{U_{\mathcal{A}}|X_{\mathcal{M}}}(u_{\mathcal{A}}|x_{\mathcal{A}},x_{{\mathcal{A}}^{c}}^{\prime})$$

$$\rho_{\mathtt{NS}}(G)=\max_{\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}\in{\mathcal{P}}_{\mathtt{NS}}}\mathbb{E}\left[\omega(X_{\mathcal{M}},U_{\mathcal{M}})\right].$$

$$\omega(x_{\mathcal{M}},u_{\mathcal{M}})=\begin{cases}1,&u_{i}\neq u_{j}\text{ if }x_{i}=x_{j}=1,\ i\neq j,\\0,&\text{otherwise},\end{cases}$$

$$\mathrm{P}_{U_{\mathcal{A}}|X_{\mathcal{M}}}(u_{\mathcal{A}}|x_{\mathcal{A}},x_{{\mathcal{A}}^{c}})\leq\mathrm{Q}_{U_{\mathcal{A}}|X_{\mathcal{A}}}(u_{\mathcal{A}}|x_{\mathcal{A}}),$$

$$d_{\mathtt{var}}\big({\mathrm{P}_{\tilde{U}_{\mathcal{A}}\tilde{X}_{\mathcal{M}}}},{\mathrm{P}_{X_{\mathcal{M}}}}{\mathrm{Q}_{U_{\mathcal{A}}|X_{\mathcal{A}}}}\big)\leq\varepsilon_{\mathcal{A}}.$$

$$d_{\mathtt{var}}\big({\mathrm{P}_{\tilde{U}_{\mathcal{M}}\tilde{X}_{\mathcal{M}}}},{\mathrm{P}_{X_{\mathcal{M}}}}\mathrm{P}^{\prime}_{U_{\mathcal{M}}|X_{\mathcal{M}}}\big)\leq\varepsilon_{\emptyset}+2\sum_{\emptyset\neq{\mathcal{A}}\subsetneq{\mathcal{M}}}\varepsilon_{\mathcal{A}}.$$

$$\rho_{\mathtt{NS}}(G)<1-\varepsilon\implies\rho_{\mathtt{SNS}}(G)<1-\frac{\varepsilon}{\Gamma}.$$

$$\rho_{\mathtt{SNS}}(G^{n},\Delta):=\max\big\{{\mathbb{P}}\left(N_{\omega}(X_{\mathcal{M}}^{n},U_{\mathcal{M}}^{n})\geq n\Delta\right):{\mathrm{P}_{U_{\mathcal{M}}^{n}|X_{\mathcal{M}}^{n}}}\in{\mathcal{P}}_{\mathtt{SNS}}({\mathcal{U}}_{\mathcal{M}}^{n}|{\mathcal{X}}_{\mathcal{M}}^{n})\big\},$$

$$\rho_{\mathtt{SNS}}(G^{n},\Delta)\leq\exp\bigg(-n\frac{\nu^{2}}{C_{m}}\bigg),$$

$$\eta_{\mathtt{NS}}(G,\delta):=\max\big\{\mathbb{E}\left[\omega(\tilde{X}_{\mathcal{M}},\tilde{U}_{\mathcal{M}})\right]:{\mathrm{P}_{\tilde{U}_{\mathcal{M}}\tilde{X}_{\mathcal{M}}}}\mbox{ s.t. }I(\tilde{U}_{\mathcal{A}}\wedge\tilde{X}_{{\mathcal{A}}^{c}}|\tilde{X}_{\mathcal{A}})+D({\mathrm{P}_{\tilde{X}_{\mathcal{M}}}}\|{\mathrm{P}_{X_{\mathcal{M}}}})\leq\delta,\ \forall{\mathcal{A}}\subsetneq{\mathcal{M}}\big\}.$$

$$\eta_{\mathtt{NS}}(G^{n},n\delta)=n\cdot\eta_{\mathtt{NS}}(G,\delta).$$

$$\mathbb{E}\left[N_{\omega}(\tilde{X}_{\mathcal{M}}^{n},\tilde{U}_{\mathcal{M}}^{n})\right]=n\,\mathbb{E}\left[\omega(\tilde{X}_{{\mathcal{M}},J},\tilde{U}_{{\mathcal{M}},J})\right],$$


Full text

A New Proof of Nonsignalling Multiprover Parallel Repetition Theorem

Himanshu Tyagi*†*

Shun Watanabe*‡*


*†*Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore 560012, India. Email: [email protected]. *‡*Department of Computer and Information Sciences, Tokyo University of Agriculture and Technology, Tokyo 184-8588, Japan. Email: [email protected].

I Introduction

The parallel repetition theorem is an important tool in theoretical computer science which is used to prove hardness of approximation results. It shows roughly that if distributed provers can satisfy a random predicate with probability $v<1$ without coordinating, then they can satisfy $n$ independent copies of the same predicate only with probability going to $0$ exponentially in $n$ . Such a theorem for the two-prover case was shown in [12], with a simplified proof given in [6]. The precise form of the statement of such a theorem relies on the structure of the query distribution, the predicate, and the class of strategies allowed for the provers. In particular, it was noted in [6] that in most applications we only need a parallel repetition theorem for nonsignalling strategies, a class of correlations that subsumes even quantum correlations.

While the validity of a multiprover parallel repetition theorem for the standard setting is unclear, recently such a theorem has been proved for the nonsignalling setting [8] (see, also, [1, 2]). The proof uses a de Finetti type decomposition of strategies and a linear programming interpretation of the value function. In this paper, we provide a new proof of the same result that is completely “information theoretic”. Our proof draws on the connection between the parallel repetition setting and that of multiuser rate-distortion theory. In particular, we rely on a change of measure approach for proving strong converse results in multiuser distributed coding problems. This approach was introduced for centralized coding problems in [5], and was recently refined and extended to distributed coding problems in [13] using a relaxation technique introduced in [9]; see [13] for a detailed account. In the change of measure approach, we first replace the hard information constraints involving conditional independence by their soft counterparts involving bounds on KL divergences. Next, we change the measure to that obtained by conditioning on the “winning” event. The $n$ -fold problem is related to a single instance of the problem using a tensorization property of the resulting value function.

This paper is part review – we recall the formulation and results for two provers in Section II, followed by those for the multiprover setting in Section III. Our main contribution is a new proof of the multiprover parallel repetition theorem (Theorem 4) given in Section IV. The final section contains brief concluding remarks.

Notation. Given random variables $(X_{1},...,X_{m})$ , for a subset ${\mathcal{A}}$ of $\{1,...,m\}$ , we abbreviate the random variable $(X_{i},i\in{\mathcal{A}})$ as $X_{\mathcal{A}}$ . Similarly, for a tuple $(x_{1},...,x_{m})$ , denote $x_{\mathcal{A}}=(x_{i},i\in{\mathcal{A}})$ . For other notation, we follow [4].

II Two-Prover Parallel Repetition Theorem

We begin by reviewing the two-prover setting. A two-prover game $G$ consists of a verifier and two provers ${\mathcal{P}}_{1}$ and ${\mathcal{P}}_{2}$ . The verifier samples a query $(X_{1},X_{2})$ according to a fixed joint distribution ${\mathrm{P}_{X_{1}X_{2}}}$ on a finite alphabet ${\mathcal{X}}_{1}\times{\mathcal{X}}_{2}$ , and sends $X_{1}$ and $X_{2}$ to ${\mathcal{P}}_{1}$ and ${\mathcal{P}}_{2}$ , respectively. Upon receiving the queries, ${\mathcal{P}}_{1}$ and ${\mathcal{P}}_{2}$ send responses $U_{1}\in{\mathcal{U}}_{1}$ and $U_{2}\in{\mathcal{U}}_{2}$ , respectively, where $U_{i}$ depends only on $X_{i}$ . They may use any mappings $f_{i}$ , $i=1,2$ , of $X_{i}$ to get $U_{i}$ ; for finite sets ${\mathcal{U}}$ and ${\mathcal{X}}$ , denote by ${\mathcal{F}}({\mathcal{U}}|{\mathcal{X}})$ the set of all mappings $f:{\mathcal{X}}\to{\mathcal{U}}$ . The provers win the game if $\omega(X_{1},X_{2},U_{1},U_{2})=1$ for a prespecified predicate $\omega:{\mathcal{X}}_{1}\times{\mathcal{X}}_{2}\times{\mathcal{U}}_{1}\times{\mathcal{U}}_{2}\to\{0,1\}$ . We will represent the game $G$ by the pair $({\mathrm{P}_{X_{1}X_{2}}},\omega)$ . The goal of the provers is to choose mappings $(f_{1},f_{2})$ that maximize the winning probability. This maximum winning probability is termed the value of the game and is given by

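$$\rho(G):=\max\big\{\mathbb{E}\left[\omega(X_{1},X_{2},f_{1}(X_{1}),f_{2}(X_{2}))\right]:f_{1}\in{\mathcal{F}}({\mathcal{U}}_{1}|{\mathcal{X}}_{1}),f_{2}\in{\mathcal{F}}({\mathcal{U}}_{2}|{\mathcal{X}}_{2})\big\}.$$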

In $n$ parallel repetitions of the game, the verifier samples sequences of queries $X_{1}^{n}$ and $X_{2}^{n}$ according to the product distribution $\mathrm{P}_{X_{1}X_{2}}^{n}$ . The provers now respond with sequences $U_{1}^{n}\in{\mathcal{U}}_{1}^{n}$ and $U_{2}^{n}\in{\mathcal{U}}_{2}^{n}$ where $U_{i}^{n}$ depends only on $X_{i}^{n}$ , $i=1,2$ . They win the game if predicates for each coordinate are satisfied, namely the predicate $\omega^{\wedge n}$ for the parallel repetition game $G^{\wedge n}$ is given by

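$$\omega^{\wedge n}(x_{1}^{n},x_{2}^{n},u_{1}^{n},u_{2}^{n}):=\bigwedge_{j=1}^{n}\omega(x_{1j},x_{2j},u_{1j},u_{2j}),$$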

where $\bigwedge$ denotes the AND function. The value of $G^{\wedge n}$ is defined similarly as follows:

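$$\rho(G^{\wedge n}):=\max\big\{\mathbb{E}\left[\omega^{\wedge n}(X_{1}^{n},X_{2}^{n},f_{1}(X_{1}^{n}),f_{2}(X_{2}^{n}))\right]:f_{1}\in{\mathcal{F}}({\mathcal{U}}_{1}^{n}|{\mathcal{X}}_{1}^{n}),f_{2}\in{\mathcal{F}}({\mathcal{U}}_{2}^{n}|{\mathcal{X}}_{2}^{n})\big\}.$$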

As a simple attempt towards winning the parallel repetition game, provers may simply apply strategies for a single instance of the game across each coordinate. In fact, they may use a different strategy for each coordinate. Clearly, any such attempt will have value at most $\rho(G)^{n}$ . But can they do better by using other functions $f_{i}$ that take into account the entire vector $X_{i}^{n}$ and do not have a product structure across coordinates? At a high level, a parallel repetition theorem says that the answer is no: The exponential decay of value with $n$ is unavoidable.
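To make the question about non-product strategies concrete, the following is a minimal brute-force sketch that computes the value of a game and of its two-fold repetition. It is an illustration only: it assumes uniform binary queries and uses the CHSH predicate of Example 1 in this section, and the names `chsh` and `repeated_value` are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction
from itertools import product


def chsh(x1, x2, u1, u2):
    """CHSH predicate: win iff u1 XOR u2 equals x1 AND x2."""
    return int((u1 ^ u2) == (x1 & x2))


def repeated_value(n):
    """Value of the n-fold repetition under uniform queries, by brute force.

    Every map f1 from query vectors to response vectors is enumerated for
    prover 1; for each f1, the best response of prover 2 is chosen
    separately for every query vector x2.
    """
    vectors = list(product((0, 1), repeat=n))
    best = Fraction(0)
    for outputs in product(vectors, repeat=len(vectors)):
        f1 = dict(zip(vectors, outputs))
        wins = 0
        for x2 in vectors:
            wins += max(
                sum(
                    all(chsh(x1[j], x2[j], f1[x1][j], u2[j]) for j in range(n))
                    for x1 in vectors
                )
                for u2 in vectors
            )
        best = max(best, Fraction(wins, len(vectors) ** 2))
    return best


single = repeated_value(1)
double = repeated_value(2)
print("one copy:", single, "two copies:", double, "product bound:", single ** 2)
```

Comparing `double` with `single ** 2` shows whether strategies that look at the whole query vector beat the product strategy for this particular game; the enumeration grows doubly exponentially in `n`, so the sketch is only practical for very small `n`.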

The first parallel repetition theorem was shown by Raz [12] (see [6] for a simpler proof).

Theorem 1 ([12]).

There exists a function $C:[0,1]\to[0,1]$ satisfying $C(t)<1$ if $t<1$ such that for any game $G$ ,

$$\rho(G^{\wedge n})\leq C(\rho(G))^{\frac{n}{\log|{\mathcal{U}}_{1}||{\mathcal{U}}_{2}|}}.$$

The statement above holds for any game $G$ with the same universal function $C(\cdot)$ and universal exponent that depends only on the cardinality of the response set ${\mathcal{U}}_{1}\times{\mathcal{U}}_{2}$ .

An important aspect of the setting above, which will be a prime focus here, is the role of randomness in response strategies. A simple derandomization argument shows that the value of games will not change if the pair $(f_{1},f_{2})$ is generated randomly using shared randomness $V$ that is independent of the query. Such strategies with shared randomness available to the provers can be described by channels

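$$\mathrm{P}_{U_{1}U_{2}|X_{1}X_{2}}(u_{1},u_{2}|x_{1},x_{2})=\sum_{f_{1}\in{\mathcal{F}}({\mathcal{U}}_{1}|{\mathcal{X}}_{1}),\,f_{2}\in{\mathcal{F}}({\mathcal{U}}_{2}|{\mathcal{X}}_{2})}\mu(f_{1},f_{2})\,\delta_{f_{1},f_{2}}(u_{1},u_{2}|x_{1},x_{2}),\tag{1}$$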

where $\mu$ is a distribution on ${\mathcal{F}}({\mathcal{U}}_{1}|{\mathcal{X}}_{1})\times{\mathcal{F}}({\mathcal{U}}_{2}|{\mathcal{X}}_{2})$ and $\delta_{f_{1},f_{2}}$ , given by $\delta_{f_{1},f_{2}}(u_{1},u_{2}|x_{1},x_{2}):={\mathds{1}}_{\{u_{1}=f_{1}(x_{1}),u_{2}=f_{2}(x_{2})\}}$ , is the deterministic strategy induced by functions $f_{1},f_{2}$ .

In physics, strategies of the form (1) are said to satisfy the hidden variable theory, a classical physics principle which says that if all the hidden variables are revealed then the state of the world will be deterministic. We denote the set of all such strategies by ${\mathcal{P}}_{\mathtt{HVT}}={\mathcal{P}}_{\mathtt{HVT}}({\mathcal{U}}_{1}\times{\mathcal{U}}_{2}|{\mathcal{X}}_{1}\times{\mathcal{X}}_{2})$ . With this new notation at our disposal and using the observation above that shared randomness does not improve the value of a game, we can express $\rho(G)$ alternatively as

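$$\rho(G)=\max_{\mathrm{P}_{U_{1}U_{2}|X_{1}X_{2}}\in{\mathcal{P}}_{\mathtt{HVT}}}\mathbb{E}\left[\omega(X_{1},X_{2},U_{1},U_{2})\right].$$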

Note that since strategies using shared randomness can perform at best as well as deterministic strategies, the same must be true for strategies using independent private randomness at the provers. Thus, yet another alternative form of $\rho(G)$ is given by

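$$\rho(G)=\max\big\{\mathbb{E}\left[\omega(X_{1},X_{2},U_{1},U_{2})\right]:{\mathrm{P}_{U_{1}U_{2}|X_{1}X_{2}}}\mbox{ s.t. }U_{1}-\!\!\!\!\circ\!\!\!\!-X_{1}-\!\!\!\!\circ\!\!\!\!-X_{2}-\!\!\!\!\circ\!\!\!\!-U_{2}\big\},$$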

namely we can consider maximization over Markov chains $U_{1}-\!\!\!\!\circ\!\!\!\!-X_{1}-\!\!\!\!\circ\!\!\!\!-X_{2}-\!\!\!\!\circ\!\!\!\!-U_{2}$ with marginal of $(X_{1},X_{2})$ fixed to ${\mathrm{P}_{X_{1}X_{2}}}$ .

It is important to examine the limitation posed by restricting to strategies in ${\mathcal{P}}_{\mathtt{HVT}}$ . In fact, a contentious debate in physics revolving around statistical modeling of quantum measurements was finally settled in the second half of the previous century through quantitative distinction between correlations allowed in hidden variable theory and more general nonsignalling correlation.

For our setting, we can define the class of nonsignalling strategies as follows.

Definition 1 (Nonsignalling strategies).

Let ${\mathcal{P}}_{\mathtt{NS}}={\mathcal{P}}_{\mathtt{NS}}({\mathcal{U}}_{1}\times{\mathcal{U}}_{2}|{\mathcal{X}}_{1}\times{\mathcal{X}}_{2})$ be the set of all strategies ${\mathrm{P}_{U_{1}U_{2}|X_{1}X_{2}}}$ satisfying

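$$\begin{aligned}\mathrm{P}_{U_{1}|X_{1}X_{2}}(u_{1}|x_{1},x_{2})&=\mathrm{P}_{U_{1}|X_{1}X_{2}}(u_{1}|x_{1},x_{2}^{\prime}),\\ \mathrm{P}_{U_{2}|X_{1}X_{2}}(u_{2}|x_{1},x_{2})&=\mathrm{P}_{U_{2}|X_{1}X_{2}}(u_{2}|x_{1}^{\prime},x_{2})\end{aligned}\tag{3}$$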

for every $x_{1}\neq x_{1}^{\prime}$ and $x_{2}\neq x_{2}^{\prime}$ . Equivalently, we can express these conditions as $I(U_{1}\wedge X_{2}|X_{1})=I(U_{2}\wedge X_{1}|X_{2})=0$ , namely the Markov relations $U_{1}-\!\!\!\!\circ\!\!\!\!-X_{1}-\!\!\!\!\circ\!\!\!\!-X_{2}$ and $U_{2}-\!\!\!\!\circ\!\!\!\!-X_{2}-\!\!\!\!\circ\!\!\!\!-X_{1}$ hold.

Note that these strategies include as a special case the “long Markov strategies” satisfying $U_{1}-\!\!\!\!\circ\!\!\!\!-X_{1}-\!\!\!\!\circ\!\!\!\!-X_{2}-\!\!\!\!\circ\!\!\!\!-U_{2}$ . This latter class performs as well as ${\mathcal{P}}_{\mathtt{HVT}}$ . In fact, it is easy to verify that strategies in ${\mathcal{P}}_{\mathtt{HVT}}$ satisfy (3), which yields

$${\mathcal{P}}_{\mathtt{HVT}}\subset{\mathcal{P}}_{\mathtt{NS}}.\tag{4}$$
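The inclusion of hidden-variable strategies in the nonsignalling class can be spot-checked numerically. The sketch below assumes binary query and response alphabets, builds random convex combinations of deterministic strategies, and tests the nonsignalling condition of Definition 1 on each. The helpers `random_hvt_strategy` and `is_nonsignalling` are illustrative names, not from the paper.

```python
import random
from itertools import product

BITS = (0, 1)


def random_hvt_strategy(rng, num_terms=5):
    """Random convex combination of deterministic strategies (f1, f2).

    Returns a dict mapping (u1, u2, x1, x2) to P(u1, u2 | x1, x2).
    """
    weights = [rng.random() for _ in range(num_terms)]
    total = sum(weights)
    terms = []
    for w in weights:
        f1 = {x: rng.choice(BITS) for x in BITS}
        f2 = {x: rng.choice(BITS) for x in BITS}
        terms.append((w / total, f1, f2))
    strategy = {}
    for x1, x2, u1, u2 in product(BITS, repeat=4):
        strategy[(u1, u2, x1, x2)] = sum(
            w for w, f1, f2 in terms if f1[x1] == u1 and f2[x2] == u2
        )
    return strategy


def is_nonsignalling(strategy, tol=1e-12):
    """Check that each prover's response marginal ignores the other query."""
    for x, u in product(BITS, repeat=2):
        # Marginal of U1 = u given X1 = x, for both values of X2.
        m1 = [sum(strategy[(u, u2, x, x2)] for u2 in BITS) for x2 in BITS]
        # Marginal of U2 = u given X2 = x, for both values of X1.
        m2 = [sum(strategy[(u1, u, x1, x)] for u1 in BITS) for x1 in BITS]
        if abs(m1[0] - m1[1]) > tol or abs(m2[0] - m2[1]) > tol:
            return False
    return True


rng = random.Random(0)
hvt_ok = all(is_nonsignalling(random_hvt_strategy(rng)) for _ in range(100))
print("all sampled hidden-variable strategies are nonsignalling:", hvt_ok)
```

A random sample is of course not a proof; the check only illustrates why the marginal of each response depends on that prover's own query alone when the strategy is a mixture of local deterministic maps.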

In typical applications of the parallel repetition theorem in complexity theory, it suffices to use a version of the theorem for nonsignalling strategies. In any case, the next question is of independent interest: Does a parallel repetition theorem hold if we allow the broader class of nonsignalling strategies?

Specifically, denote by $\rho_{\mathtt{NS}}(G)$ the maximum probability of satisfying $\omega$ using nonsignalling strategies, i.e.,

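$$\rho_{\mathtt{NS}}(G):=\max_{\mathrm{P}_{U_{1}U_{2}|X_{1}X_{2}}\in{\mathcal{P}}_{\mathtt{NS}}}\mathbb{E}\left[\omega(X_{1},X_{2},U_{1},U_{2})\right].$$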

By (4), $\rho(G)\leq\rho_{\mathtt{NS}}(G)$ . In fact, the inequality can be strict for some games as illustrated by the next example.

Example 1.

Consider the following CHSH type Bell test experiment [3]. For ${\mathcal{X}}_{1}={\mathcal{X}}_{2}={\mathcal{U}}_{1}={\mathcal{U}}_{2}=\{0,1\}$ , let ${\mathrm{P}_{X_{1}X_{2}}}$ be the uniform distribution on $\{0,1\}^{2}$ , and let predicate $\omega$ be given by $\omega(x_{1},x_{2},u_{1},u_{2})=\mathbf{1}[u_{1}\oplus u_{2}=x_{1}\wedge x_{2}]$ . It can be seen that the winning probability of any deterministic strategy $\delta_{f_{1},f_{2}}$ is upper bounded by $3/4$ , whereby $\rho(G)\leq 3/4$ . This bound is attained by the deterministic strategy $f_{1}(x_{1})=f_{2}(x_{2})=0$ for all $x_{1},x_{2}\in\{0,1\}$ .
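The bound of $3/4$ for deterministic strategies can be confirmed by exhaustive search, since each prover has only four deterministic maps from $\{0,1\}$ to $\{0,1\}$. The following is a small sketch of that search; `classical_value` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)


def chsh(x1, x2, u1, u2):
    """CHSH predicate: win iff u1 XOR u2 equals x1 AND x2."""
    return int((u1 ^ u2) == (x1 & x2))


def classical_value():
    """Maximum winning probability over all pairs of deterministic maps."""
    best = Fraction(0)
    # A deterministic map {0,1} -> {0,1} is encoded as the pair (f(0), f(1)).
    for f1, f2 in product(product(BITS, repeat=2), repeat=2):
        wins = sum(
            chsh(x1, x2, f1[x1], f2[x2]) for x1, x2 in product(BITS, repeat=2)
        )
        best = max(best, Fraction(wins, 4))  # queries are uniform on {0,1}^2
    return best


rho = classical_value()
print(rho)
```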

Next, consider the strategy given by

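$$\mathrm{P}^{\mathtt{PR}}_{U_{1}U_{2}|X_{1}X_{2}}(u_{1},u_{2}|x_{1},x_{2})=\frac{1}{2}\mathbf{1}[u_{1}\oplus u_{2}=x_{1}\wedge x_{2}].$$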

This particular correlation is termed the Popescu-Rohrlich box, PR box for short, since it appeared in [10]. It can be verified that this strategy satisfies the nonsignalling condition (3). Moreover, the provers win the game with probability $1$ by using this strategy. Therefore, $\rho_{\mathtt{NS}}(G)=1$ , strictly more than $\rho(G)$ .
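Both claims about the PR box, that it is nonsignalling and that it wins with probability $1$, can be checked with exact arithmetic. The sketch below does so under the example's assumptions (uniform queries, binary alphabets); the function name `pr_box` is illustrative.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
PAIRS = list(product(BITS, repeat=2))


def pr_box(u1, u2, x1, x2):
    """PR box: uniform over the response pairs with u1 XOR u2 == x1 AND x2."""
    return Fraction(1, 2) if (u1 ^ u2) == (x1 & x2) else Fraction(0)


# Each conditional distribution sums to one.
normalized = all(
    sum(pr_box(u1, u2, x1, x2) for u1, u2 in PAIRS) == 1 for x1, x2 in PAIRS
)

# Nonsignalling: the marginal of U1 does not change with x2, and the
# marginal of U2 does not change with x1.
nonsignalling = all(
    sum(pr_box(u, v, x, 0) for v in BITS) == sum(pr_box(u, v, x, 1) for v in BITS)
    and sum(pr_box(v, u, 0, x) for v in BITS) == sum(pr_box(v, u, 1, x) for v in BITS)
    for u, x in PAIRS
)

# Winning probability under uniform queries.
win_probability = sum(
    Fraction(1, 4) * pr_box(u1, u2, x1, x2)
    for x1, x2 in PAIRS
    for u1, u2 in PAIRS
    if (u1 ^ u2) == (x1 & x2)
)
print(normalized, nonsignalling, win_probability)
```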

Holenstein proved the following version of the parallel repetition theorem for nonsignalling strategies.

Theorem 2 ([6]).

There exists a function $C:[0,1]\to[0,1]$ satisfying $C(t)<1$ if $t<1$ such that for any game $G$ ,

$$\rho_{\mathtt{NS}}(G^{\wedge n})\leq C(\rho_{\mathtt{NS}}(G))^{n}.$$

Note that now the exponent of parallel repetition theorem doesn’t even depend on the cardinality of response set. Also, we remark that the proof of Theorem 2 in [6] is much simpler than the simplified proof of Theorem 1 in the same paper.

III Multiprover Parallel Repetition Theorem

Moving to the multiprover setting, a multiprover game $G=({\mathrm{P}_{X_{\mathcal{M}}}},\omega)$ consists of a verifier and $m$ provers ${\mathcal{P}}_{1},\ldots,{\mathcal{P}}_{m}$ . Denoting ${\mathcal{M}}=\{1,...,m\}$ and $X_{\mathcal{M}}=(X_{1},\ldots,X_{m})$ , the verifier samples a query $X_{\mathcal{M}}$ according to a fixed joint distribution ${\mathrm{P}_{X_{{\mathcal{M}}}}}$ and sends $X_{i}$ to ${\mathcal{P}}_{i}$ for $i$ in ${\mathcal{M}}$ . Upon receiving the queries, each prover ${\mathcal{P}}_{i}$ sends a response $U_{i}\in{\mathcal{U}}_{i}$ , $1\leq i\leq m$ , to the verifier. The provers win the game if $\omega(X_{\mathcal{M}},U_{\mathcal{M}})=1$ for a given predicate $\omega:{\mathcal{X}}_{\mathcal{M}}\times{\mathcal{U}}_{\mathcal{M}}\to\{0,1\}$ .

As in the previous section, the provers’ strategy can be described by a channel ${\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}}$ . The set of all strategies that can be described as a convex combination of deterministic, local strategies is denoted by ${\mathcal{P}}_{\mathtt{HVT}}={\mathcal{P}}_{\mathtt{HVT}}({\mathcal{U}}_{{\mathcal{M}}}|{\mathcal{X}}_{{\mathcal{M}}})$ . Namely, ${\mathcal{P}}_{\mathtt{HVT}}$ is the set of all strategies of the form

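$$\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}(u_{\mathcal{M}}|x_{\mathcal{M}})=\sum_{f_{i}\in{\mathcal{F}}({\mathcal{U}}_{i}|{\mathcal{X}}_{i}),\,i\in{\mathcal{M}}}\mu(f_{\mathcal{M}})\,\delta_{f_{\mathcal{M}}}(u_{\mathcal{M}}|x_{\mathcal{M}}),$$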

where $\mu$ is a probability measure on $\prod_{i=1}^{m}{\mathcal{F}}({\mathcal{U}}_{i}|{\mathcal{X}}_{i})$ and $\delta_{f_{\mathcal{M}}}(u_{\mathcal{M}}|x_{\mathcal{M}})={\mathds{1}}_{\{u_{i}=f_{i}(x_{i}),i\in{\mathcal{M}}\}}$ . The value of the game that can be attained by strategies satisfying hidden variable theory is given by

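$$\rho(G)=\max_{\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}\in{\mathcal{P}}_{\mathtt{HVT}}}\mathbb{E}\left[\omega(X_{\mathcal{M}},U_{\mathcal{M}})\right].$$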

The parallel repetition game $G^{\wedge n}$ is defined analogously to the two-prover setting.

For the multiprover setting, a nonsignalling strategy is a channel ${\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}}$ such that the following condition is satisfied:

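$$\mathrm{P}_{U_{\mathcal{A}}|X_{\mathcal{M}}}(u_{\mathcal{A}}|x_{\mathcal{A}},x_{{\mathcal{A}}^{c}})=\mathrm{P}_{U_{\mathcal{A}}|X_{\mathcal{M}}}(u_{\mathcal{A}}|x_{\mathcal{A}},x_{{\mathcal{A}}^{c}}^{\prime})$$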

for all $x_{\mathcal{A}},x_{{\mathcal{A}}^{c}},{x}_{{\mathcal{A}}^{c}}^{\prime},u_{{\mathcal{A}}}$ and all subsets ${\mathcal{A}}$ of ${\mathcal{M}}$ . Denoting the set of all nonsignalling strategies by ${\mathcal{P}}_{\mathtt{NS}}={\mathcal{P}}_{\mathtt{NS}}({\mathcal{U}}_{\mathcal{M}}|{\mathcal{X}}_{\mathcal{M}})$ , the value of the game that can be attained by nonsignalling strategies is given by

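$$\rho_{\mathtt{NS}}(G)=\max_{\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}\in{\mathcal{P}}_{\mathtt{NS}}}\mathbb{E}\left[\omega(X_{\mathcal{M}},U_{\mathcal{M}})\right].$$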

A general parallel repetition theorem for strategies in ${\mathcal{P}}_{\mathtt{HVT}}$ is not known. As we have mentioned at the end of the previous section, proving a parallel repetition theorem for strategies in ${\mathcal{P}}_{\mathtt{NS}}$ is relatively easier than for strategies in ${\mathcal{P}}_{\mathtt{HVT}}$ ; the former is known to hold under the condition that the query distribution ${\mathrm{P}_{X_{\mathcal{M}}}}$ has full support [1, 2]. Remarkably, without the full support condition, a counterexample to a parallel repetition theorem for ${\mathcal{P}}_{\mathtt{NS}}$ appeared in [7]. Specifically, it was shown that the following three-prover game satisfies $\rho_{\mathtt{NS}}(G^{\wedge n})=2/3$ for all $n$ (this example appeared first in [1]):

Example 2 (Anticorrelation game).

For $m=3$ , let ${\mathcal{X}}_{i}={\mathcal{U}}_{i}=\{0,1\}$ , $1\leq i\leq 3$ . Let query distribution ${\mathrm{P}_{X_{\mathcal{M}}}}$ be uniform on all strings $(x_{1},x_{2},x_{3})$ with Hamming weight $2$ . The required game $G$ is given by predicate

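$$\omega(x_{\mathcal{M}},u_{\mathcal{M}})=\begin{cases}1,&u_{i}\neq u_{j}\text{ if }x_{i}=x_{j}=1,\ i\neq j,\\0,&\text{otherwise},\end{cases}$$

i.e., the responses differ at the two locations where queries are $1$ .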
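As a sanity check on the value $2/3$ for a single copy, the following sketch enumerates all deterministic local strategies for this three-prover game. It assumes the anticorrelation reading of the predicate, namely that the provers win when the two provers receiving query $1$ give different responses (with identical responses as the win condition, the all-zero strategy would win with probability $1$). Since deterministic local strategies are nonsignalling, the result is only a lower bound on $\rho_{\mathtt{NS}}(G)$; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
# Queries are uniform over the three binary strings of Hamming weight 2.
QUERIES = [x for x in product(BITS, repeat=3) if sum(x) == 2]


def wins(x, u):
    """Win iff the two provers that received query 1 answer differently."""
    i, j = [k for k in range(3) if x[k] == 1]
    return int(u[i] != u[j])


def classical_value():
    """Maximum winning probability over deterministic local strategies."""
    best = Fraction(0)
    maps = list(product(BITS, repeat=2))  # a map is the pair (f(0), f(1))
    for f in product(maps, repeat=3):
        total = sum(
            wins(x, tuple(f[k][x[k]] for k in range(3))) for x in QUERIES
        )
        best = max(best, Fraction(total, len(QUERIES)))
    return best


value = classical_value()
print(value)
```

The search reflects a simple counting fact: among three bits, at most two of the three pairs can differ.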

This example rules out a parallel repetition theorem for ${\mathcal{P}}_{\mathtt{NS}}$ in general. In other words, $\rho_{\mathtt{NS}}(G)<1$ is not sufficient to claim the exponential decay of winning probability in parallel repetition games. In fact, even preceding this counterexample, a parallel repetition theorem, i.e., exponential decay, was shown to hold if the value of the single game for a broader class of strategies, called sub-nonsignalling strategies, is strictly less than $1$ [8].

Sub-nonsignalling strategies ${\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}}$ , which we define next, need not be conditional distributions and are only required to be subnormalized, namely nonnegative and satisfying $\sum_{u_{\mathcal{M}}}{\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}}(u_{\mathcal{M}}|x_{\mathcal{M}})\leq 1$ . Both total variation distance and KL divergence can be applied to such subnormalized distributions. We remark that the marginal ${\mathrm{P}_{X}}$ and the conditional distribution ${\mathrm{P}_{Y|X}}$ , respectively, for a subnormalized distribution ${\mathrm{P}_{XY}}$ are defined as $\mathrm{P}_{X}\left({x}\right)=\sum_{y}\mathrm{P}_{XY}\left({x,y}\right)$ and $\mathrm{P}_{Y|X}\left({y|x}\right)=\mathrm{P}_{XY}\left({x,y}\right)/\mathrm{P}_{X}\left({x}\right)$ . While ${\mathrm{P}_{X}}$ , too, is a subnormalized distribution, ${\mathrm{P}_{Y|X}}$ will be a (normalized) distribution.

Definition 2 (Sub-nonsignalling strategies).

The set ${\mathcal{P}}_{\mathtt{SNS}}$ of sub-nonsignalling strategies consists of subnormalized ${\mathrm{P}_{U_{\mathcal{M}}|X_{\mathcal{M}}}}$ such that, for each subset ${\mathcal{A}}$ of ${\mathcal{M}}$ , there exists a channel ${\mathrm{Q}_{U_{\mathcal{A}}|X_{\mathcal{A}}}}$ satisfying:

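$$\mathrm{P}_{U_{\mathcal{A}}|X_{\mathcal{M}}}(u_{\mathcal{A}}|x_{\mathcal{A}},x_{{\mathcal{A}}^{c}})\leq\mathrm{Q}_{U_{\mathcal{A}}|X_{\mathcal{A}}}(u_{\mathcal{A}}|x_{\mathcal{A}}),$$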

for all $x_{\mathcal{A}},x_{{\mathcal{A}}^{c}},u_{{\mathcal{A}}}$ .

Note that nonsignalling strategies are those for which the inequality condition above is replaced with equality. Heuristically, sub-nonsignalling strategies may be regarded as the class of strategies close to nonsignalling strategies in statistical distance. Another heuristic was suggested in [8], which interpreted sub-nonsignalling strategies as nonsignalling strategies with the additional $x_{\mathcal{M}}$ -dependent power to randomly abstain from responding. In fact, we can find a sub-nonsignalling strategy close to a distribution for which all conditional distributions ${\mathrm{P}_{U_{\mathcal{A}}|X_{\mathcal{M}}}}$ are close to some conditional distributions ${\mathrm{Q}_{U_{\mathcal{A}}|X_{\mathcal{A}}}}$ . (Footnote 1: Lemma 3 is a multiprover extension of [6, Lemma 9.5], which showed that in the two-prover setting we can find a nonsignalling $\mathrm{P}^{\prime}_{U_{\mathcal{M}}|X_{\mathcal{M}}}$ .)

Lemma 3 ([8, Lemma 5.2]).

Let ${\mathrm{P}_{X_{\mathcal{M}}}}$ be a query distribution on ${\mathcal{X}}_{\mathcal{M}}$ , and let ${\mathrm{P}_{\tilde{U}_{\mathcal{M}}\tilde{X}_{\mathcal{M}}}}$ be a probability distribution on ${\mathcal{U}}_{\mathcal{M}}\times{\mathcal{X}}_{\mathcal{M}}$ . Suppose that for each ${\mathcal{A}}\subsetneq{\mathcal{M}}$ there exists a conditional distribution ${\mathrm{Q}_{U_{\mathcal{A}}|X_{\mathcal{A}}}}$ such that

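$$d_{\mathtt{var}}\big({\mathrm{P}_{\tilde{U}_{\mathcal{A}}\tilde{X}_{\mathcal{M}}}},{\mathrm{P}_{X_{\mathcal{M}}}}{\mathrm{Q}_{U_{\mathcal{A}}|X_{\mathcal{A}}}}\big)\leq\varepsilon_{\mathcal{A}}.$$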

Then, there exists a sub-nonsignalling $\mathrm{P}^{\prime}_{U_{\mathcal{M}}|X_{\mathcal{M}}}$ such that

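$$d_{\mathtt{var}}\big({\mathrm{P}_{\tilde{U}_{\mathcal{M}}\tilde{X}_{\mathcal{M}}}},{\mathrm{P}_{X_{\mathcal{M}}}}\mathrm{P}^{\prime}_{U_{\mathcal{M}}|X_{\mathcal{M}}}\big)\leq\varepsilon_{\emptyset}+2\sum_{\emptyset\neq{\mathcal{A}}\subsetneq{\mathcal{M}}}\varepsilon_{\mathcal{A}}.$$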

By definition, the value of the game that is attained by sub-nonsignalling strategies satisfies $\rho_{\mathtt{SNS}}(G)\geq\rho_{\mathtt{NS}}(G)$ . For two-prover games, $\rho_{\mathtt{SNS}}(G)$ was shown in [8] to coincide with $\rho_{\mathtt{NS}}(G)$ . However, equality may not hold for multiprover games, in general. Indeed, the game in Example 2 has $\rho_{\mathtt{NS}}(G)=2/3$ and $\rho_{\mathtt{SNS}}(G)=1$ . Interestingly, when the query distribution ${\mathrm{P}_{X_{\mathcal{M}}}}$ has full support, there exists a constant $\Gamma=\Gamma({\mathrm{P}_{X_{\mathcal{M}}}})$ such that, for $\varepsilon>0$ (cf. [8]),

$$\rho_{\mathtt{NS}}(G)<1-\varepsilon\implies\rho_{\mathtt{SNS}}(G)<1-\frac{\varepsilon}{\Gamma}.\tag{6}$$
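The statement that the game of Example 2 has $\rho_{\mathtt{SNS}}(G)=1$ can be illustrated with an explicit strategy. The sketch below is our own construction, not taken from [8]: the strategy abstains (assigns zero mass) on queries outside the support of the query distribution, and on the support it answers uniformly over the winning responses. It assumes the anticorrelation win condition (the two provers with query $1$ must answer differently) and uses the observation that a dominating channel ${\mathrm{Q}_{U_{\mathcal{A}}|X_{\mathcal{A}}}}$ exists exactly when, for every $x_{\mathcal{A}}$, the pointwise maximum over $x_{{\mathcal{A}}^{c}}$ of the marginal sums to at most $1$.

```python
from fractions import Fraction
from itertools import combinations, product

BITS = (0, 1)
PROVERS = (0, 1, 2)
SUPPORT = [x for x in product(BITS, repeat=3) if sum(x) == 2]


def wins(x, u):
    """Win iff the two provers that received query 1 answer differently."""
    i, j = [k for k in PROVERS if x[k] == 1]
    return int(u[i] != u[j])


def strategy(u, x):
    """Hypothetical subnormalized strategy P(u | x)."""
    if x not in SUPPORT:
        return Fraction(0)  # abstain on queries the verifier never asks
    if u[x.index(0)] != 0:
        return Fraction(0)  # the prover with query 0 always answers 0
    return Fraction(1, 2) * wins(x, u)


def is_sub_nonsignalling():
    """Check Definition 2 for every subset A of the provers."""
    for size in range(len(PROVERS) + 1):
        for subset in combinations(PROVERS, size):
            rest = [k for k in PROVERS if k not in subset]
            for x_a in product(BITS, repeat=size):
                envelope = Fraction(0)
                for u_a in product(BITS, repeat=size):
                    worst = Fraction(0)
                    for x_rest in product(BITS, repeat=len(rest)):
                        x = [0, 0, 0]
                        for k, v in zip(subset, x_a):
                            x[k] = v
                        for k, v in zip(rest, x_rest):
                            x[k] = v
                        marginal = sum(
                            strategy(u, tuple(x))
                            for u in product(BITS, repeat=3)
                            if all(u[k] == v for k, v in zip(subset, u_a))
                        )
                        worst = max(worst, marginal)
                    envelope += worst
                if envelope > 1:  # no channel Q can dominate the marginals
                    return False
    return True


sns_ok = is_sub_nonsignalling()
sns_value = sum(
    Fraction(1, len(SUPPORT)) * strategy(u, x) * wins(x, u)
    for x in SUPPORT
    for u in product(BITS, repeat=3)
)
print(sns_ok, sns_value)
```

The freedom to abstain off the support is what separates this strategy from a nonsignalling one, which matches the role of the full support condition in (6).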

Before we state the parallel repetition theorem for sub-nonsignalling strategies, we switch to a slightly more general formulation where in the $n$ parallel repetition game, instead of winning all the games, we are interested in quantifying the probability that the provers win more than a fraction $\Delta$ of the games. This formulation is closer to the rate-distortion theory formulation of information theory and appeared, for instance, in [11]. Specifically, for $0<\Delta\leq 1$ , consider

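$$\rho_{\mathtt{SNS}}(G^{n},\Delta):=\max\big\{{\mathbb{P}}\left(N_{\omega}(X_{\mathcal{M}}^{n},U_{\mathcal{M}}^{n})\geq n\Delta\right):{\mathrm{P}_{U_{\mathcal{M}}^{n}|X_{\mathcal{M}}^{n}}}\in{\mathcal{P}}_{\mathtt{SNS}}({\mathcal{U}}_{\mathcal{M}}^{n}|{\mathcal{X}}_{\mathcal{M}}^{n})\big\},\tag{7}$$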

where $N_{\omega}(x_{\mathcal{M}}^{n},u_{\mathcal{M}}^{n}):=\sum_{j=1}^{n}\omega(x_{{\mathcal{M}},j},u_{{\mathcal{M}},j})$ . Since $\omega(x_{{\mathcal{M}},j},u_{{\mathcal{M}},j})$ is the indicator for a win in the $j$ th coordinate, $N_{\omega}(x_{\mathcal{M}}^{n},u_{\mathcal{M}}^{n})$ denotes the total number of wins. Analogously, $\rho_{\mathtt{NS}}(G^{n},\Delta)$ is defined by restricting the maximum in (7) to nonsignalling strategies; our original definition $\rho_{\mathtt{NS}}(G^{\wedge n})$ coincides with $\rho_{\mathtt{NS}}(G^{n},1)$ .
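For intuition about the threshold formulation, consider the simplest case of a product strategy, in which each of the $n$ coordinates is won independently with the same probability $v$. The number of wins is then binomial, and the probability of at least $n\Delta$ wins can be computed exactly. The sketch below does this and compares the tail with Hoeffding's bound $\exp(-2n(\Delta-v)^{2})$. The values of `v` and `delta` are arbitrary illustrative choices, and the comparison covers product strategies only; it says nothing about general sub-nonsignalling strategies.

```python
from math import ceil, comb, exp


def product_strategy_tail(n, v, delta):
    """P(at least n*delta wins) when each of n rounds is won independently
    with probability v."""
    threshold = ceil(n * delta - 1e-12)  # guard against float round-off
    return sum(
        comb(n, k) * v**k * (1 - v) ** (n - k) for k in range(threshold, n + 1)
    )


v, delta = 0.75, 0.85  # illustrative single-game value and win fraction
rows = []
for n in (20, 40, 80):
    tail = product_strategy_tail(n, v, delta)
    hoeffding = exp(-2 * n * (delta - v) ** 2)
    rows.append((n, tail, hoeffding))
    print(n, tail, hoeffding)
```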

We now recall the multiprover parallel repetition theorem from [8].

Theorem 4.

Let $G=({\mathrm{P}_{X_{\mathcal{M}}}},\omega)$ be a multiprover game with $\rho_{\mathtt{SNS}}(G)<1$ . For any $\Delta\geq\rho_{\mathtt{SNS}}(G)+\nu$ with $0<\nu\leq 1-\rho_{\mathtt{SNS}}(G)$ , we have

$$\rho_{\mathtt{SNS}}(G^{n},\Delta)\leq\exp\bigg(-n\frac{\nu^{2}}{C_{m}}\bigg),$$

where the constant $C_{m}={\mathcal{O}}(2^{2m})$ depends only on $m=|{\mathcal{M}}|$ .

For multiprover games with full support query distributions, Theorem 4 together with (6) implies the parallel repetition theorem for nonsignalling strategies, shown first in [2].

The proof of the multiprover parallel repetition theorem for nonsignalling strategies and full support query distribution in [2] entails extending the proof approach for the two-prover setting in [6]. An alternative proof was provided in [1] by using a technique based on the de Finetti theorem. At a high level, this technique allows us to restrict attention to convex combinations of product strategies. In [8], the parallel repetition theorem for sub-nonsignalling strategies, namely Theorem 4, was proved by using another variant of the de Finetti theorem.

In the next section, we provide an alternative proof of Theorem 4. Our proof is based on a technique recently developed by the authors in [13] to prove strong converse theorems for multi-user information theory problems. A crucial observation is that the parallel repetition theorem can be regarded as an exponential strong converse of a multi-user rate-distortion problem with no communication. In contrast to the proof in [8] that uses a structural decomposition of strategies, our proof is completely “information theoretic”.

IV A New Proof of Theorem 4

Our goal is to derive an upper bound for the maximum probability of the event ${\mathcal{C}}$ of winning more than $\Delta$ fraction of games. Following [13], we start with a change of measure (query distribution and provers’ strategy) by conditioning on ${\mathcal{C}}$ . The “distance” between the new distribution and the original distribution is bounded in terms of the exponent of the probability of ${\mathcal{C}}$ . However, since we have conditioned the strategy on the winning event, the information structure may break down – we are only guaranteed to be “close” to a distribution satisfying our original information constraints. Nonetheless, to complete the proof we need an appropriate single-letterization argument to relate this new game to one instance of the original game.

To enable this, our proof looks at the expected number of wins instead of the probability of winning. For a given multiprover game $G=({\mathrm{P}_{X_{\mathcal{M}}}},\omega)$ and $\delta\geq 0$ , define

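$$\eta_{\mathtt{NS}}(G,\delta):=\max\big\{\mathbb{E}\left[\omega(\tilde{X}_{\mathcal{M}},\tilde{U}_{\mathcal{M}})\right]:{\mathrm{P}_{\tilde{U}_{\mathcal{M}}\tilde{X}_{\mathcal{M}}}}\mbox{ s.t. }I(\tilde{U}_{\mathcal{A}}\wedge\tilde{X}_{{\mathcal{A}}^{c}}|\tilde{X}_{\mathcal{A}})+D({\mathrm{P}_{\tilde{X}_{\mathcal{M}}}}\|{\mathrm{P}_{X_{\mathcal{M}}}})\leq\delta,\ \forall{\mathcal{A}}\subsetneq{\mathcal{M}}\big\}.$$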

Note that the maximum is over the set of distributions, which we call $\delta$ -approximate nonsignalling distributions, that satisfy the information structure only approximately. In particular, we have replaced the hard information constraints required by nonsignalling strategies by their soft counterparts expressed by bounds on KL divergence. Below we shall see two properties of $\eta_{\mathtt{NS}}(G,\delta)$ : it tensorizes and can be bounded above roughly by $\rho_{\mathtt{SNS}}(G)$ . We note that a linear programming based notion of approximate nonsignalling strategies was used in [6, 2, 1, 8]. Our divergence based notion of approximation is amenable to tensorization and facilitates an information theoretic proof.

We can now apply our proof recipe outlined earlier. Under the changed measure obtained by conditioning on ${\mathcal{C}}$ , the expected number of wins is more than $n\Delta$ . Also, this new measure satisfies the soft information constraint bound with $\delta$ equal to the exponent of probability of ${\mathcal{C}}$ . Thus, $\eta_{\mathtt{NS}}(G^{n},\delta)$ must be more than $n\Delta$ . Using the properties of $\eta_{\mathtt{NS}}(G^{n},\delta)$ mentioned earlier, we can bound it above roughly by $n\rho_{\mathtt{SNS}}(G)$ , which shows that $\Delta$ must be roughly bounded above by $\rho_{\mathtt{SNS}}(G)$ . The required bound for exponent is obtained by the contrapositive statement.
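The link between conditioning on ${\mathcal{C}}$ and the exponent of its probability rests on a standard identity: for a distribution $\mathrm{P}$ and an event ${\mathcal{C}}$ with $\mathrm{P}({\mathcal{C}})>0$, the divergence of the conditioned distribution from $\mathrm{P}$ equals $\log(1/\mathrm{P}({\mathcal{C}}))$. The following toy sketch checks this numerically. It uses a normalized distribution on a small hypothetical alphabet and an arbitrary event, so it illustrates the identity rather than the subnormalized setting of the proof.

```python
import math
import random

rng = random.Random(1)
outcomes = range(12)

# A random normalized distribution P on a toy alphabet.
weights = [rng.random() for _ in outcomes]
total = sum(weights)
p = [w / total for w in weights]

event = {1, 4, 7, 9}  # hypothetical "winning" event C
p_event = sum(p[z] for z in event)

# Changed measure: P conditioned on the event C.
p_tilde = [p[z] / p_event if z in event else 0.0 for z in outcomes]

# KL divergence D(P_tilde || P) in bits, skipping zero-mass outcomes.
divergence = sum(
    q * math.log2(q / p[z]) for z, q in enumerate(p_tilde) if q > 0
)
print(divergence, -math.log2(p_event))
```

If $\mathrm{P}({\mathcal{C}})>2^{-n\delta}$, the identity gives a divergence below $n\delta$, which is how the soft constraint level is tied to the exponent.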

Formal arguments follow. We begin with the tensorization property.

Lemma 5.

For a given multiprover game $G=({\mathrm{P}_{X_{\mathcal{M}}}},\omega)$ , $n\in\mathbb{N}$ and $\delta\geq 0$ , we have

$$\eta_{\mathtt{NS}}(G^{n},n\delta)=n\cdot\eta_{\mathtt{NS}}(G,\delta).$$

Proof.

The inequality $\eta_{\mathtt{NS}}(G^{n},n\delta)\geq n\eta_{\mathtt{NS}}(G,\delta)$ holds by definition. For the other direction, fix an $n\delta$ -approximate nonsignalling distribution ${\mathrm{P}_{\tilde{U}_{\mathcal{M}}^{n}\tilde{X}_{\mathcal{M}}^{n}}}$ . We have

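$$\mathbb{E}\left[N_{\omega}(\tilde{X}_{\mathcal{M}}^{n},\tilde{U}_{\mathcal{M}}^{n})\right]=n\,\mathbb{E}\left[\omega(\tilde{X}_{{\mathcal{M}},J},\tilde{U}_{{\mathcal{M}},J})\right],\tag{8}$$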

where $J$ is distributed uniformly on $\{1,\ldots,n\}$ . Furthermore,

[TABLE]

where the first inequality follows from [13, Proposition 1] and the second and the third inequalities hold since conditioning decreases entropy. Thus, ${\mathrm{P}_{\tilde{U}_{{\mathcal{M}},J},\tilde{X}_{{\mathcal{M}},J}}}$ is a $\delta$ -approximate nonsignalling distribution and the claim follows by (8). ∎

Next, we relate $\eta_{\mathtt{NS}}(G,\delta)$ and $\rho_{\mathtt{SNS}}(G)$ using Lemma 3.

Lemma 6.

For a given multiprover game $G=({\mathrm{P}_{X_{\mathcal{M}}}},\omega)$ and $\delta\geq 0$ , we have

[TABLE]

where the constant $C_{m}^{\prime}={\mathcal{O}}(2^{m})$ depends only on $m=|{\mathcal{M}}|$ .

Proof.

Consider a $\delta$ -approximate nonsignalling distribution ${\mathrm{P}_{\tilde{U}_{\mathcal{M}}\tilde{X}_{\mathcal{M}}}}$ . For any ${\mathcal{A}}\subsetneq{\mathcal{M}}$ , since $I(\tilde{U}_{\mathcal{A}}\wedge\tilde{X}_{{\mathcal{A}}^{c}}|\tilde{X}_{\mathcal{A}})=D({\mathrm{P}_{\tilde{U}_{\mathcal{A}}\tilde{X}_{\mathcal{M}}}}\|{\mathrm{P}_{\tilde{X}_{\mathcal{M}}}}{\mathrm{P}_{\tilde{U}_{\mathcal{A}}|\tilde{X}_{\mathcal{A}}}})\leq\delta$ and $D({\mathrm{P}_{\tilde{X}_{\mathcal{M}}}}\|{\mathrm{P}_{X_{\mathcal{M}}}})\leq\delta$ , by using Pinsker’s inequality [4] and the triangle inequality, we get

[TABLE]

Next, by applying Lemma 3 with $\varepsilon_{\mathcal{A}}=\sqrt{(2\ln 2)\delta}$ , there exists a sub-nonsignaling strategy $\mathrm{P}^{\prime}_{U_{\mathcal{M}}|X_{\mathcal{M}}}$ such that

[TABLE]

Finally, since $\omega$ is bounded by $1$ , we have

[TABLE]

where the final inequality uses the fact that $\mathrm{P}^{\prime}_{\tilde{U}_{\mathcal{M}}|\tilde{X}_{\mathcal{M}}}$ is sub-nonsignalling. We obtain the claimed bound with $C_{m}^{\prime}=2(2^{m+1}-3)$ since ${\mathrm{P}_{\tilde{U}_{\mathcal{M}}\tilde{X}_{\mathcal{M}}}}$ was an arbitrary $\delta$ -approximate nonsignalling distribution. ∎
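The Pinsker step used in this proof can be sanity-checked numerically. The choice $\varepsilon_{\mathcal{A}}=\sqrt{(2\ln 2)\delta}$ matches the $\ell_{1}$ form of Pinsker's inequality with divergence measured in bits, namely $\|\mathrm{P}-\mathrm{Q}\|_{1}\leq\sqrt{(2\ln 2)D(\mathrm{P}\|\mathrm{Q})}$ ; that this is the convention intended here is an assumption of the sketch. The two distributions and the function names below are hypothetical and serve only as an illustration.

```python
import math


def kl_bits(p, q):
    """KL divergence D(p || q) in bits; terms with p(x) = 0 contribute zero."""
    return sum(pv * math.log2(pv / q[x]) for x, pv in p.items() if pv > 0)


def l1_distance(p, q):
    """l1 distance between two distributions on the same finite alphabet."""
    return sum(abs(p[x] - q[x]) for x in p)


def pinsker_radius(delta_bits):
    """Upper bound on the l1 distance implied by a divergence of delta_bits."""
    return math.sqrt(2.0 * math.log(2.0) * delta_bits)


# Two example distributions on a binary alphabet (assumed for illustration).
p = {0: 0.5, 1: 0.5}
q = {0: 0.8, 1: 0.2}

delta = kl_bits(p, q)
distance = l1_distance(p, q)
radius = pinsker_radius(delta)

# Pinsker's inequality: the l1 distance never exceeds the radius.
print(distance, radius)
```

Since the bound scales as $\sqrt{\delta}$ , a small divergence budget translates into a much larger, though still vanishing, $\ell_{1}$ slack.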

We have all the tools for the proof of Theorem 4 in place.

Proof of Theorem 4

If $\rho_{\mathtt{SNS}}(G^{n},\Delta)>\exp(-n\delta)$ for some $\delta>0$ , we can find a sub-nonsignalling strategy ${\mathrm{P}_{U_{\mathcal{M}}^{n}|X_{\mathcal{M}}^{n}}}$ such that ${\mathbb{P}}\left(N_{\omega}(U_{\mathcal{M}}^{n},X_{\mathcal{M}}^{n})\geq n\Delta\right)>\exp(-n\delta)$ . Denoting

[TABLE]

we change the measure by conditioning on the event $(U_{\mathcal{M}}^{n},X_{\mathcal{M}}^{n})\in{\mathcal{C}}$ (although ${\mathrm{P}_{U_{\mathcal{M}}X_{\mathcal{M}}}}$ is only a subnormalized distribution, the changed measure ${\mathrm{P}_{\tilde{U}_{\mathcal{M}}\tilde{X}_{\mathcal{M}}}}$ is a distribution) as follows:

[TABLE]

Then, by a simple calculation, we have

[TABLE]

Furthermore, for each ${\mathcal{A}}\subsetneq{\mathcal{M}}$ , denoting by ${\mathrm{Q}_{U_{\mathcal{A}}^{n}|X_{\mathcal{A}}^{n}}}$ the dominating conditional distribution for the sub-nonsignaling strategy ${\mathrm{P}_{U_{\mathcal{A}}^{n}|X_{\mathcal{A}}^{n}}}$ (cf. (5)), we have

[TABLE]

where the second inequality follows from the sub-nonsignalling condition (5) and the third inequality uses the fact that ${\mathrm{P}_{U_{{\mathcal{A}}^{c}}^{n}|U_{\mathcal{A}}^{n}X_{\mathcal{M}}^{n}}}$ is a conditional distribution. The above bound implies that the changed measure $\mathrm{P}_{\tilde{U}_{\mathcal{M}}^{n}\tilde{X}_{\mathcal{M}}^{n}}$ is a $\delta$ -approximate nonsignalling distribution. Furthermore, since $N_{\omega}(\tilde{U}_{\mathcal{M}}^{n},\tilde{X}_{\mathcal{M}}^{n})\geq n\Delta$ holds with probability $1$ under the changed measure ${\mathrm{P}_{\tilde{U}_{\mathcal{M}}^{n}\tilde{X}_{\mathcal{M}}^{n}}}$ , we have

[TABLE]

which together with Lemma 5 and Lemma 6 implies

[TABLE]

By considering contraposition, if

[TABLE]

then we have $\rho_{\mathtt{SNS}}(G^{n},\Delta)\leq\exp(-n\delta)$ . Thus, by setting $\delta=\frac{\nu^{2}}{(2\ln 2)(C_{m}^{\prime}+1)^{2}}$ , $\Delta\geq\rho_{\mathtt{SNS}}(G)+\nu$ implies (9), and we have the claim of the theorem. ∎
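To get a feel for the size of the resulting exponent, the short Python sketch below evaluates the constant $C_{m}^{\prime}=2(2^{m+1}-3)$ from Lemma 6 and the choice $\delta=\frac{\nu^{2}}{(2\ln 2)(C_{m}^{\prime}+1)^{2}}$ made above. The function names and the sample values of $m$ and $\nu$ are assumptions for illustration only; the sketch simply evaluates the two formulas as stated in the text.

```python
import math


def lemma6_constant(m):
    """Constant C'_m = 2 * (2^(m+1) - 3) for a game with m provers."""
    return 2 * (2 ** (m + 1) - 3)


def exponent_delta(m, nu):
    """Exponent delta = nu^2 / ((2 ln 2) * (C'_m + 1)^2) for a gap nu > 0."""
    c = lemma6_constant(m)
    return nu ** 2 / ((2.0 * math.log(2.0)) * (c + 1) ** 2)


# Sample values (assumed): gap nu = 0.1 and m = 2, 3, 4 provers.
nu = 0.1
for m in (2, 3, 4):
    print(m, lemma6_constant(m), exponent_delta(m, nu))
```

By construction, $(C_{m}^{\prime}+1)\sqrt{(2\ln 2)\delta}=\nu$ for this choice of $\delta$ . Because $C_{m}^{\prime}$ grows like $2^{m}$ , the exponent decays roughly like $4^{-m}$ in the number of provers for a fixed gap $\nu$ .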

V Discussion

A multiprover parallel repetition theorem for standard strategies, i.e., strategies satisfying the hidden variable theory, is not available. In fact, our initial attempt in this work was to provide an alternative proof of the two-prover parallel repetition theorem for the standard strategies. We tried to prove a counterpart of the tensorization property, Lemma 5, for standard strategies. However, our preliminary attempt failed, mainly because it was difficult to identify a suitable soft constraint for the long Markov chain in (2). Nonetheless, we do believe that our measure change approach can be used to obtain a parallel repetition theorem for standard strategies, perhaps by proving an approximate tensorization property of the value function with suitable soft constraints.

Bibliography

[1] R. Arnon-Friedman, R. Renner, and T. Vidick, “Non-signaling parallel repetition using de Finetti reductions,” IEEE Trans. Inf. Theory, vol. 62, no. 3, pp. 1440–1457, November 2016.
[2] H. Buhrman, S. Fehr, and C. Schaffner, “On the parallel repetition of multi-player games: The no-signaling case,” in Leibniz International Proceedings in Informatics, 2014, pp. 24–35.
[3] J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt, “Proposed experiment to test local hidden-variable theories,” Physical Review Letters, vol. 23, no. 15, pp. 880–884, October 1969.
[4] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd edition. Cambridge University Press, 2011.
[5] W. Gu and M. Effros, “A strong converse for a collection of network source coding problems,” Proc. IEEE International Symposium on Information Theory, pp. 2316–2320, 2009.
[6] T. Holenstein, “Parallel repetition: Simplifications and the no-signaling case,” Theory of Computing, vol. 5, pp. 141–172, 2009.
[7] J. Holmgren and L. Yang, “(A counterexample to) Parallel repetition for non-signaling multi-player games,” 2018, http://people.csail.mit.edu/holmgren/papers/ns-parrep.pdf.
[8] C. Lancien and A. Winter, “Parallel repetition and concentration for (sub-)no-signalling games via a flexible constrained de Finetti reduction,” Chicago J. Theor. Comput. Sci., no. 11, pp. 1–22, 2016.
