Playing Games with Bounded Entropy: Convergence Rate and Approximate Equilibria

Mehrdad Valizadeh; Amin Gohari

arXiv:1902.03676 · cs.GT · February 12, 2019

TL;DR

This paper analyzes the convergence rate of the long-run value in zero-sum repeated games with limited randomness strategies and characterizes the set of approximate Nash equilibria related to this value.

Contribution

It introduces a new tool, based on Rényi entropies, for simulating one source from another, and characterizes approximate equilibria in games with bounded-entropy strategies.

Findings

1. The convergence rate of $v_n$ to its limit is bounded explicitly.

2. The precision of simulation improves exponentially in the difference of the Rényi entropies of the coin and target sources.

3. The set of approximate equilibria is closely related to the long-run max-min value.

Equations

$$\|p_{f(X)Y}-p_{A}p_{Y}\|_{TV}\leq 2^{-\left(1-\frac{1}{\alpha}\right)\left(H_{\alpha}(X|Y)-H_{\frac{1}{\alpha}}(A)+2\right)},$$

$$\|p_{f(X^{n})Y^{n}}-p_{A^{n}}p_{Y^{n}}\|_{TV}\leq 2^{-n\left(1-\frac{1}{\alpha}\right)\left(H_{\alpha}(X|Y)-H_{\frac{1}{\alpha}}(A)\right)},$$

$$p(x^{n})=\prod_{i=1}^{n}p(x_{i}).$$

$$\|p_{X}-q_{X}\|_{TV}\triangleq\frac{1}{2}\sum_{x\in\mathcal{X}}|p_{X}(x)-q_{X}(x)|.$$

$$H(X)=\sum_{x\in\mathcal{X}}-p_{X}(x)\log(p_{X}(x)),$$

$$H(X|Y)=\sum_{y\in\mathcal{Y}}p_{Y}(y)H(X|Y=y),$$

$$H_{\alpha}(X)=\frac{\alpha}{1-\alpha}\log\left(\sum_{x\in\mathcal{X}}p_{X}(x)^{\alpha}\right)^{\frac{1}{\alpha}}=\frac{\alpha}{1-\alpha}\log\|p_{X}\|_{\alpha},$$

$$H_{\alpha}(X|Y)=\frac{\alpha}{1-\alpha}\log\sum_{y\in\mathcal{Y}}p_{Y}(y)\|p_{X|Y=y}\|_{\alpha},$$

$$\lim_{\alpha\to 1}H_{\alpha}(X)=H(X),\qquad\lim_{\alpha\to 1}H_{\alpha}(X|Y)=H(X|Y).$$

$$d_{1}(X)\triangleq-\frac{d}{d\alpha}H_{\alpha}(X)\Big|_{\alpha=1}=\frac{1}{2\log e}\left(\sum_{x\in\mathcal{X}}p(x)(\log(p(x)))^{2}-H(X)^{2}\right).$$

$$H_{\alpha}(X)=H(X)-d_{1}(X)(\alpha-1)+R_{X}(\alpha),$$

$$|R_{X}(\alpha)|\leq d_{2}(X)(\alpha-1)^{2},$$

$$d_{2}(X)=\frac{1}{2}\max_{1/2\leq\alpha^{\prime}\leq 2}\left|\frac{d^{2}H_{\alpha}(X)}{d\alpha^{2}}\Big|_{\alpha=\alpha^{\prime}}\right|.$$

$$H_{\alpha}(X|Y)=H(X|Y)-d_{1}(X|Y)(\alpha-1)+R_{X|Y}(\alpha),$$

$$d_{1}(X|Y)=-\frac{d}{d\alpha}H_{\alpha}(X|Y)\Big|_{\alpha=1}=\sum_{y\in\mathcal{Y}}p_{Y}(y)d_{1}(p_{X|Y=y})+\frac{1}{2\log e}\left(\sum_{y\in\mathcal{Y}}p_{Y}(y)H(X|Y=y)^{2}-H(X|Y)^{2}\right).$$

$$|R_{X|Y}(\alpha)|\leq d_{2}(X|Y)(\alpha-1)^{2},$$

$$d_{2}(X|Y)=\frac{1}{2}\max_{1/2\leq\alpha^{\prime}\leq 2}\left|\frac{d^{2}H_{\alpha}(X|Y)}{d\alpha^{2}}\Big|_{\alpha=\alpha^{\prime}}\right|.$$

$$\lambda(\sigma^{n},\tau^{n})=\mathbb{E}_{\sigma^{n},\tau^{n}}\left[\frac{1}{n}\sum_{t=1}^{n}u_{A_{t}B_{t}}\right],$$

$$U^{(A)}(p_{A})=\min_{b\in\mathcal{B}}\sum_{a\in\mathcal{A}}p_{A}(a)u_{ab}.$$

$$\mathcal{J}^{(A)}(\mathsf{h})=\max_{p_{A}\in\Delta(\mathcal{A}),\,H(p_{A})\leq\mathsf{h}}U^{(A)}(p_{A}).$$

$$\lambda(\sigma^{n},\tau^{n})\geq\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))-\mu\left(\frac{1}{n}+\frac{1}{f_{n}}+\frac{f_{n}}{n}+g_{n}+2^{-\frac{1}{2}\left(\frac{n}{f_{n}}-1\right)h_{n}(\beta g_{n}-\gamma h_{n})}\right).$$

$$\lambda(\sigma^{n},\tau^{n})\geq\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))-\mathcal{O}(g_{n})=\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))-\mathcal{O}\left(\frac{\log n}{4n}\right).$$

$$2^{-\frac{1}{2}\left(\frac{n}{f_{n}}-1\right)h_{n}(\beta g_{n}-\gamma h_{n})}=\mathcal{O}\left(2^{-\frac{ng_{n}^{2}}{2r^{2}}}\right)=\mathcal{O}\left(\frac{1}{n}\right).$$

$$2^{-\frac{1}{2}\left(\frac{n}{f_{n}}-1\right)h_{n}(\beta g_{n}-\gamma h_{n})}.$$

$$\lambda(\sigma^{n},\tau^{n})\geq\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))-\mu\left(\frac{1}{n}+g_{n}+2^{-\frac{1}{2}nh_{n}(\beta g_{n}-\gamma h_{n})}\right).$$

$$\lambda(\sigma^{n},\tau^{n})\geq\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))-\mathcal{O}\left(\frac{\log n}{n}\right).$$

$$q_{A}\in\arg\max_{p_{A}\in\Delta(\mathcal{A})}\min_{b\in\mathcal{B}}\sum_{a\in\mathcal{A}}p_{A}(a)u_{ab}.$$

$$\lambda(\sigma^{n},\tau^{n})\geq\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))-\gamma 2^{-\beta n}.$$

$$\|p_{f(X)Y}-p_{A}p_{Y}\|_{TV}\leq\epsilon,$$

Full text

Playing Games with Bounded Entropy: Convergence Rate and Approximate Equilibria

Mehrdad Valizadeh, Amin Gohari

Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran

Abstract

We consider zero-sum repeated games in which the players are restricted to strategies that require only a limited amount of randomness. Let $v_{n}$ be the max-min value of the $n$-stage game; previous works have characterized $\lim_{n\rightarrow\infty}v_{n}$, i.e., the long-run max-min value. Our first contribution is to study the convergence rate of $v_{n}$ to its limit. To this end, we provide a new tool for simulating one source (the target source) from another source (the coin source). With the total variation distance as the measure of precision, this tool gives an upper bound on the simulation error that vanishes exponentially in the difference of the Rényi entropies of the coin and target sources. In the second part of the paper, we characterize the set of all approximate Nash equilibria achieved in the long run. It turns out that this set is closely related to the long-run max-min value.

Keywords: Repeated Games, Bounded Entropy, Randomness Extraction, Source Simulation, Information Theory

Journal: Games and Economic Behavior

1 Introduction
1.1 A new tool
2 Preliminaries
2.1 Notations
2.2 Shannon Entropy
2.3 Rényi Entropy
3 Repeated games with leaked randomness source: convergence rate
3.1 Problem statement and results
3.2 A technical tool: simulation of a source from another source
3.3 Proof of Theorem 6
3.3.1 Proof of Lemma 16
3.4 Proof of Theorem 9
3.5 Proof of Theorem 11
4 Approximate Nash equilibria of the repeated game with leaked randomness source
4.1 Proof of Theorem 23
4.2 Approximate Nash equilibria achieved by autonomous strategies
Appendix A Proof of Lemma 14
Appendix B Proof of Proposition 15
Appendix C Proof of Theorem 27
Appendix C.1 Achievability proof
Appendix C.2 Converse proof

1 Introduction

Nash (1950) showed that all one-shot games have at least one equilibrium in mixed strategies. Private randomness is required to implement mixed strategies, and consequently a Nash equilibrium may not exist if the players do not have enough random bits available (see Hubáček et al. (2016) and Budinich and Fortnow (2011)).

Limited randomness in repeated zero-sum games was originally studied by Neyman and Okada (2000) and Gossner and Vieille (2002). Gossner and Vieille (2002) studied a repeated zero-sum game between Alice (the maximizer) and Bob (the minimizer). At the beginning of each stage of the game, Alice observed an independent drawing of a random source $X$ with a commonly known distribution. Next, the players played an action which was monitored by the other player. The only source of randomization available to Alice was the outcomes of random source $X$ . Thus, Alice had to choose the action of each stage as a deterministic function of the history of her observations, i.e., the random sources revealed up to that stage and the previous actions. However, Bob could freely randomize his actions, and hence, at each stage, he chose his action as a random function of the actions played previously. Generalizing the model of Gossner and Vieille (2002), Valizadeh and Gohari (2017) considered the possibility of leakage of Alice’s random source sequence to Bob; thus, they called it the repeated game with leaked randomness source. In other words, Bob monitored the random source of Alice through a noisy channel. Specifically, let $(X_{1},Y_{1})$ , $(X_{2},Y_{2}),\ldots$ be a sequence of independent and identically distributed (i.i.d.) random variables distributed according to a given distribution $p_{XY}$ . At arbitrary stage $t$ , before choosing the actions for that stage, Alice observed $X_{t}$ , and Bob observed $Y_{t}$ . In this model, Alice and Bob could randomize their actions at each stage just by conditioning their actions to the history of their observations up to that stage.

In this paper, we study two different aspects of the repeated game with leaked randomness sources. Our first contribution is to study the max-min payoff that Alice can secure in a repeated game with finite number of stages. Note that Valizadeh and Gohari (2017) characterized the long run max-min value, i.e., the maximum payoff that Alice can secure regardless of what strategy Bob chooses when the number of stages tends to infinity. More precisely, let $v_{n}$ be the max-min value of the $n$ -stage repeated game with leaked randomness source. Valizadeh and Gohari (2017) characterized $\lim_{n\to\infty}v_{n}$ . In this paper, we investigate how $v_{n}$ converges to its limit. To do so, we develop and utilize a new tool for simulation of a source from another source, which we will introduce later in Section 1.1.

Our second contribution is to study the set of equilibria that is implementable by Alice and Bob in the repeated game with leaked randomness sources. As stated above, implementable Nash equilibria do not necessarily exist. However, a relaxed version of Nash equilibria called approximate Nash equilibria may exist. Let $\epsilon_{A}$ and $\epsilon_{B}$ be arbitrary positive numbers. We say a given strategy profile forms a $(\epsilon_{A},\epsilon_{B})$ -Nash equilibrium if Alice and Bob do not gain more than $\epsilon_{A}$ and $\epsilon_{B}$ , respectively, by unilaterally changing their corresponding strategies. We characterize the set of $(\epsilon_{A},\epsilon_{B})$ -Nash equilibria of the repeated game when the number of stages of the game tends to infinity. This set is characterized in terms of the maximum payoffs that Alice and Bob can secure in long run (long run max-min and min-max values).

Note that in previous works (Neyman and Okada (2000); Gossner and Vieille (2002); Valizadeh and Gohari (2017)), the max-min (or min-max) value of the zero-sum repeated game was achieved by autonomous strategies, i.e., strategies that ignore the opponent's actions in past stages. We therefore ask whether autonomous strategies suffice to achieve all implementable approximate Nash equilibria. To answer this, we also characterize the set of all approximate Nash equilibria achieved by autonomous strategies in the long run. It turns out that this set is strictly smaller than the set of approximate equilibria achieved by arbitrary strategies.

1.1 A new tool

A key step in the proofs of Gossner and Vieille (2002) and Valizadeh and Gohari (2017) is to divide the $n$ stages of the repeated game into blocks such that the actions of the first player in each block (excluding the first block) are generated as a function of the randomness source observed during the previous block. (This strategy is known as the block Markov strategy in information theory and is used in multi-hop communication settings.) In other words, the actions of the first player in each block are simulated from the randomness source observed in the previous block. Since we are interested in the non-asymptotic regime where the number of stages $n$ is given and fixed, we need to carefully optimize over the length of the blocks and also prove a fine estimate of the accuracy with which the actions of each block can be simulated from the observations of the previous block. Thus, in order to study the repeated game with $n$ stages, we provide a new tool for simulation of a source from another source, which is of independent interest.

More precisely, in abstract terms, let $X\in\mathcal{X}$ and $Y\in\mathcal{Y}$ be arbitrary discrete random variables distributed according to some probability mass function $p_{XY}$ , and let $A\in\mathcal{A}$ be a target random variable distributed according to $p_{A}$ . We would like to simulate $A$ from $X$ (by using a deterministic function $f:\mathcal{X}\to\mathcal{A}$ ) in such a way that the resulting random variable, $f(X)$ , is almost independent of $Y$ , and its distribution is close to $p_{A}$ . Intuitively, if the amount of uncertainty of $X$ given $Y$ is much more than the amount of uncertainty of $A$ , then, one might find a simulator $f:\mathcal{X}\to\mathcal{A}$ satisfying the above conditions. We take the Rényi entropy as our measure of uncertainty, and the total variation distance as our measure of similarity. We prove that there exists a mapping $f:\mathcal{X}\to\mathcal{A}$ such that for arbitrary $1\leq\alpha\leq 2$ ,

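$$\|p_{f(X)Y}-p_{A}p_{Y}\|_{TV}\leq 2^{-\left(1-\frac{1}{\alpha}\right)\left(H_{\alpha}(X|Y)-H_{\frac{1}{\alpha}}(A)+2\right)},\tag{1}$$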

where $\|.\|_{TV}$ denotes the total variation distance, $H_{\alpha}(X|Y)$ denotes the conditional Rényi entropy (with parameter $\alpha$ ) of $X$ given $Y$ , and $H_{\frac{1}{\alpha}}(A)$ is the Rényi entropy of $A$ with parameter $1/\alpha$ . The main idea to prove Equation (1) is to relate it to norms of linear maps and then utilize the Riesz-Thorin Interpolation Theorem.

To better understand Equation (1), let us apply it to a sequence of random variables. Assume that $(X_{1},Y_{1})$ , $(X_{2},Y_{2})$ , …, $(X_{n},Y_{n})$ are $n$ i.i.d. repetitions according to $p_{XY}$ . Our goal is to simulate $(A_{1},A_{2},\ldots,A_{n})$ , which is an i.i.d. sequence according to $p_{A}$ . Applying Equation (1) to $\tilde{X}=(X_{1},\ldots,X_{n})$ , $\tilde{Y}=(Y_{1},\ldots,Y_{n})$ , and $\tilde{A}=(A_{1},\ldots,A_{n})$ , we obtain that there exists a mapping $f:\mathcal{X}^{n}\to\mathcal{A}^{n}$ such that for arbitrary $1\leq\alpha\leq 2$ ,

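$$\|p_{f(X^{n})Y^{n}}-p_{A^{n}}p_{Y^{n}}\|_{TV}\leq 2^{-n\left(1-\frac{1}{\alpha}\right)\left(H_{\alpha}(X|Y)-H_{\frac{1}{\alpha}}(A)\right)},\tag{2}$$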

where we used the fact that $H_{\alpha}(X^{n}|Y^{n})=nH_{\alpha}(X|Y)$ and $H_{\frac{1}{\alpha}}(A^{n})=nH_{\frac{1}{\alpha}}(A)$ . Equation (2) shows that the accuracy of simulation is improving exponentially fast in the product of three terms: the block length $n$ , the term $1-{1}/{\alpha}$ , and the entropy difference $H_{\alpha}(X|Y)-H_{\frac{1}{\alpha}}(A)$ .
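To get a feel for how the $n$-letter bound $2^{-n(1-1/\alpha)(H_{\alpha}(X|Y)-H_{1/\alpha}(A))}$ behaves, the following is a minimal numerical sketch. It only evaluates the right-hand side of the bound for given pmfs; it does not construct the mapping $f$. The function names, the list-of-rows layout for $p_{XY}$, and the example distributions are assumptions made for illustration, and $\alpha$ is taken in $(1,2]$ to avoid the removable singularity at $\alpha=1$.

```python
import math

def renyi_entropy(p, alpha):
    """Renyi entropy H_alpha(p) in bits, for alpha > 0 and alpha != 1."""
    norm = sum(q ** alpha for q in p if q > 0) ** (1.0 / alpha)  # alpha-norm of p
    return alpha / (1.0 - alpha) * math.log2(norm)

def cond_renyi_entropy(p_xy, alpha):
    """Conditional Renyi entropy H_alpha(X|Y) in bits; p_xy[x][y] is the joint pmf."""
    total = 0.0
    for y in range(len(p_xy[0])):
        p_y = sum(row[y] for row in p_xy)
        if p_y > 0:
            # alpha-norm of the conditional pmf p_{X|Y=y}, weighted by p_Y(y)
            cond_norm = sum((row[y] / p_y) ** alpha for row in p_xy if row[y] > 0) ** (1.0 / alpha)
            total += p_y * cond_norm
    return alpha / (1.0 - alpha) * math.log2(total)

def simulation_bound(n, alpha, p_xy, p_a):
    """Right-hand side of the n-letter bound on the total variation distance."""
    gap = cond_renyi_entropy(p_xy, alpha) - renyi_entropy(p_a, 1.0 / alpha)
    return 2.0 ** (-n * (1.0 - 1.0 / alpha) * gap)

# Hypothetical example: X uniform on four symbols, Y constant (no leakage), A a biased bit.
p_xy = [[0.25], [0.25], [0.25], [0.25]]
p_a = [0.9, 0.1]
for n in (1, 10, 20):
    print(n, simulation_bound(n, 2.0, p_xy, p_a))
```

The bound is informative only when the entropy gap is positive; for a negative gap the right-hand side exceeds one and says nothing about the total variation distance.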

Moreover, Equation (1) can be interpreted in a different way: we say that $\mathsf{R}(\cdot)$ is a measure of randomness if for any discrete random variable $X$ , $\mathsf{R}(X)$ is a non-negative real number. The value $\mathsf{R}(X)$ quantifies the amount of uncertainty in $X$ . Then, $\mathsf{R}(\cdot)$ is a reasonable measure of randomness only if it is non-increasing under mappings. In other words, if random variable $A$ is a deterministic function of random variable $X$ , we expect $\mathsf{R}(X)\geq\mathsf{R}(A)$ . The question then arises whether the converse to this statement can also be true:

Question: Is there a suitable measure of randomness $\mathsf{R}(\cdot)$ such that $\mathsf{R}(X)\geq\mathsf{R}(A)$ if and only if there is a function $f:\mathcal{X}\mapsto\mathcal{A}$ such that $f(X)$ is distributed according to $p_{A}$ ?

While the answer to this question is negative, our tool shows that an approximate version of it holds. To see why the answer to this question is negative, let $X\in\{0,1\}$ be a binary random variable. Then, $f(X)$ has the same amount of randomness as $X$ if $f$ is a one-to-one function ( $f(0)\neq f(1)$ ), and $f(X)$ is deterministic if $f(0)=f(1)$ . Therefore, $\mathsf{R}(f(X))\in\{0,\mathsf{R}(X)\}$ and cannot take values lying strictly between $0$ and $\mathsf{R}(X)$ . However, if we require $f(X)$ to have a distribution that is only “approximately” equal to $p_{A}$ , the above question can be revisited. In fact, our tool shows that Rényi entropy is an answer to the approximate version of the above question.

Relation of Equation (1) to previous works: The problem of simulation of a source from another source dates back to the work of Von Neumann (1951), who considered the problem of generating a sequence of i.i.d. fair bits from a given sequence of i.i.d. unfair bits. The algorithm presented by Von Neumann (1951) is universal in the sense that it does not need the knowledge of the distribution of the input bits, and it is exact in the sense that the output bits are exactly fair. Von Neumann (1951) also offered a non-universal exact algorithm for simulation of a desired continuous distribution from a given continuous random variable with known distribution. A generalization of the algorithm of Von Neumann (1951) for arbitrary Markov inputs can be found in Elias (1972) and Bernardini and Rinaldo (2018). There are other works that have considered non-exact simulation of a source. Considering the total variation distance as the measure of accuracy, Yassaee et al. (2014) studied non-universal generation of independent fair bits from an i.i.d. sequence of random variables with side information, and Han (2003) considered the simulation of a general sequence from a general input sequence with known distribution. Fundamental limits for generation of arbitrary random sequence from a general sequence of random variables under different measures of accuracy has been studied by Vembu and Verdú (1995) and Yu and Tan (2019).
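As a concrete instance of exact, universal simulation, here is a minimal sketch of the classical von Neumann procedure for i.i.d. input bits: read the input in non-overlapping pairs, output the first bit of each unequal pair, and discard equal pairs. Since the pairs 01 and 10 are equally likely for any bias, the output bits are exactly fair. The function name is an illustrative choice.

```python
def von_neumann_extract(bits):
    """Map i.i.d. biased bits to fair bits: pair 01 -> 0, pair 10 -> 1, drop 00 and 11."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out

print(von_neumann_extract([0, 1, 1, 0, 1, 1, 0, 0, 1, 0]))  # prints [0, 1, 1]
```

The procedure never uses the bias of the input, which is what makes it universal, but it discards many input bits; the tool developed in this paper instead targets approximate simulation of an arbitrary target distribution.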

Above works considered the simulation of an intended long sequence from a long input sequence. In contrast, a different approach for generating random bits (randomness extraction) is to provide results for arbitrary single-letter sources, and then, conclude results for sequences; works of Renner (2008), Hayashi (2011) and Mojahedian et al. (2018) on randomness extraction and privacy amplification lie in this category. The tool we present in this paper generalizes the results of Renner (2008), Hayashi (2011) and Mojahedian et al. (2018); in fact, they considered the special case of simulation of random variable $A$ having a uniform distribution over a set $\mathcal{A}$ (when $A$ is uniform, simulating $A$ can be interpreted as extracting $\log|\mathcal{A}|$ bits of randomness). Furthermore, in this paper, we adopt the total variation distance as the measure of accuracy which has a close relation with the expected payoff in games. We also use concentration inequalities to provide further refinements (Proposition 15).

The rest of this paper is organized as follows: In Section 2, we introduce the notations of this paper and present a brief discussion of Shannon and Rényi entropy. The repeated game with leaked randomness source is defined in Section 3, where we also provide our results on the convergence rate of the max-min payoff of games with finite number of stages. In Section 3.2, we introduce our tool for simulation of a source from another source. In Section 4, we characterize the set of approximate Nash equilibria achievable in long run. Some of the proofs are presented in Appendices.

2 Preliminaries

2.1 Notations

In this paper, we use the notation $x^{j}$ to represent a sequence of variables $(x_{1},x_{2},\ldots,x_{j})$ . The same notation is used to represent sequences of random variables, i.e., $X^{j}=(X_{1},X_{2},\ldots,X_{j})$ . Note that this notation is used for sequences that have two subscripts the same way, i.e., $X_{k}^{j}=(X_{k,1},X_{k,2},\ldots,X_{k,j})$ . Calligraphic letters such as $\mathcal{X},\mathcal{Y},\mathcal{A},\mathcal{B},\dots$ represent finite sets, and $|\mathcal{X}|$ denotes the cardinality of the finite set $\mathcal{X}$ . Cartesian product of two sets $\mathcal{A}$ and $\mathcal{B}$ is denoted by $\mathcal{A}\times\mathcal{B}$ , and $\mathcal{A}^{n}$ stands for $n$ times cartesian product of $\mathcal{A}$ . The set of natural numbers is represented by $\mathbbm{N}$ , and $\mathbbm{R}$ denotes the set of real numbers. For a real number $a$ , $\lfloor a\rfloor$ is the largest integer less than or equal to $a$ , and $\lceil a\rceil$ is the smallest integer greater than or equal to $a$ . Furthermore, let $f(.)$ and $g(.)$ be two real functions on the set of real numbers; we write $f(a)=\mathcal{O}(g(a))$ if and only if there exists a real constant $c$ such that for all $a\in\mathbbm{R}$ , we have $|f(a)|\leq c|g(a)|$ . We use the notation $f_{n}=\mathcal{O}(g_{n})$ for real sequences $\{f_{n}\}_{n\in\mathbbm{N}}$ and $\{g_{n}\}_{n\in\mathbbm{N}}$ in the same manner.

The probability mass function (pmf) of a random variable $X$ is represented by $p_{X}(x)$ . When it is obvious from the context, we drop the subscript and use $p(x)$ instead of $p_{X}(x)$ . We say that $X^{n}$ is drawn i.i.d. from $p(x)$ if

$$p(x^{n})=\prod_{i=1}^{n}p(x_{i}).$$

We use $\Delta(\mathcal{A})$ to denote the probability simplex on alphabet $\mathcal{A}$ , i.e., the set of all probability distributions on the finite set $\mathcal{A}$ . The total variation distance between pmfs $p_{X}$ and $q_{X}$ is denoted by $\|p_{X}-q_{X}\|_{TV}$ and is defined as:

$$\|p_{X}-q_{X}\|_{TV}\triangleq\frac{1}{2}\sum_{x\in\mathcal{X}}|p_{X}(x)-q_{X}(x)|.$$

Some of the properties of the total variation distance are summarized in the following lemma.

Lemma 1.

The following properties hold for the total variation distance:

Property 1: $\|p_{E}p_{F|E}-q_{E}p_{F|E}\|_{TV}=\|p_{E}-q_{E}\|_{TV}$ ;

Property 2: $\|p_{E}p_{F|E}-q_{E}q_{F|E}\|_{TV}\geq\|p_{E}-q_{E}\|_{TV}$ ;

Property 3: $\|p_{E_{1}}p_{F_{1}}-p_{E_{2}}q_{F_{2}}\|_{TV}\leq\|p_{E_{1}}-p_{E_{2}}\|_{TV}+\|p_{F_{1}}-q_{F_{2}}\|_{TV}$ .
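The first two properties of the total variation distance can be checked numerically on a small example. The sketch below computes half the $\ell_1$ distance between two pmfs stored as dictionaries and builds joint pmfs from a marginal and a conditional; the helper names and the example numbers are assumptions chosen for illustration.

```python
def tv_distance(p, q):
    """Total variation distance: half the L1 distance between two pmfs (dicts)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def joint(p_e, p_f_given_e):
    """Joint pmf p_E(e) * p_{F|E}(f|e) as a dict keyed by (e, f)."""
    return {(e, f): p_e[e] * w for e in p_e for f, w in p_f_given_e[e].items()}

p_e = {0: 0.5, 1: 0.5}
q_e = {0: 0.8, 1: 0.2}
channel_p = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}
channel_q = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}

base = tv_distance(p_e, q_e)
same_channel = tv_distance(joint(p_e, channel_p), joint(q_e, channel_p))   # Property 1
other_channel = tv_distance(joint(p_e, channel_p), joint(q_e, channel_q))  # Property 2
print(base, same_channel, other_channel)
```

With a shared conditional the distance between the joints equals the distance between the marginals, and with different conditionals it can only grow.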

2.2 Shannon Entropy

Let $X\in\mathcal{X}$ and $Y\in\mathcal{Y}$ be two random variables with joint probability distribution $p_{XY}$ and respective marginal distributions $p_{X}$ and $p_{Y}$ . The Shannon entropy of the random variable $X$ is defined to be:

$$H(X)=\sum_{x\in\mathcal{X}}-p_{X}(x)\log(p_{X}(x)),$$

where $0\log(0)=0$ by continuity, and all logarithms in this paper are in base two. Since the Shannon entropy is a function of the pmf $p_{X}$ , we sometimes write $H(p_{X})$ instead of $H(X)$ .

The conditional Shannon entropy of $X$ given $Y$ is defined as:

$$H(X|Y)=\sum_{y\in\mathcal{Y}}p_{Y}(y)H(X|Y=y),$$

where $H(X|Y=y)=\sum_{x\in\mathcal{X}}-p_{X|Y}(x|y)\log(p_{X|Y}(x|y))$ .

The following properties hold for the entropy function:

1. $H(X)\geq 0$.

2. For an arbitrary deterministic function $f(x)$ , we have $H(f(X))\leq H(X)$ .
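The second property, that a deterministic map cannot increase entropy, is easy to see on an example. The sketch below computes the Shannon entropy in bits and the distribution of $f(X)$; the helper names and the choice of $f$ are illustrative assumptions.

```python
import math
from collections import defaultdict

def shannon_entropy(p):
    """Shannon entropy in bits of a pmf given as a dict symbol -> probability."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def pushforward(p, f):
    """Distribution of f(X) when X is distributed according to p."""
    out = defaultdict(float)
    for x, q in p.items():
        out[f(x)] += q
    return dict(out)

p_x = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}   # uniform on four symbols
p_fx = pushforward(p_x, lambda x: x % 2)     # f merges symbols of equal parity
print(shannon_entropy(p_x), shannon_entropy(p_fx))
```

Here $f$ merges symbols in pairs, so one of the two bits of randomness in $X$ is lost.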

2.3 Rényi Entropy

Let $X\in\mathcal{X}$ and $Y\in\mathcal{Y}$ be two random variables with joint probability distribution $p_{XY}$ and respective marginal distributions $p_{X}$ and $p_{Y}$ . For arbitrary $\alpha>0$ , the Rényi entropy of random variable $X$ with parameter $\alpha$ is defined as follows:

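$$H_{\alpha}(X)=\frac{\alpha}{1-\alpha}\log\left(\sum_{x\in\mathcal{X}}p_{X}(x)^{\alpha}\right)^{\frac{1}{\alpha}}=\frac{\alpha}{1-\alpha}\log\|p_{X}\|_{\alpha},$$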

where $\|p_{X}\|_{\alpha}=\left(\sum_{x\in\mathcal{X}}p_{X}(x)^{\alpha}\right)^{\frac{1}{\alpha}}$ is the $\alpha$ -norm of $p_{X}$ . Since the Rényi entropy is a function of the pmf $p_{X}$ , we sometimes write $H_{\alpha}(p_{X})$ instead of $H_{\alpha}(X)$ .

The conditional Rényi entropy of $X$ given $Y$ with parameter $\alpha$ is defined as:

$$H_{\alpha}(X|Y)=\frac{\alpha}{1-\alpha}\log\sum_{y\in\mathcal{Y}}p_{Y}(y)\|p_{X|Y=y}\|_{\alpha},$$

where $p_{X|Y=y}$ is the conditional distribution of $X$ given $Y=y$ .

Rényi entropy is related to Shannon entropy by the following relations:

$$\lim_{\alpha\to 1}H_{\alpha}(X)=H(X),\qquad\lim_{\alpha\to 1}H_{\alpha}(X|Y)=H(X|Y).$$

Let us fix $X\in\mathcal{X}$ and consider $H_{\alpha}(X)$ as a function of $\alpha$ . $H_{\alpha}(X)$ is analytic for all $\alpha\geq 0$ , and hence differentiable to all orders. In this paper, we are interested in the values of Rényi entropy for $1/2\leq\alpha\leq 2$ . In particular, at $\alpha=1$ we have:

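$$d_{1}(X)\triangleq-\frac{d}{d\alpha}H_{\alpha}(X)\Big|_{\alpha=1}=\frac{1}{2\log e}\left(\sum_{x\in\mathcal{X}}p(x)(\log(p(x)))^{2}-H(X)^{2}\right).$$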

Note that $H(X)=-\sum_{x\in\mathcal{X}}p(x)\log(p(x))$ , and the function $f(a)=a^{2}$ is convex. Therefore, Jensen’s inequality gives $\sum_{x\in\mathcal{X}}p(x)(\log(p(x)))^{2}\geq H(X)^{2}$ , so $d_{1}(X)$ is non-negative. Using the Taylor expansion, for $1/2\leq\alpha\leq 2$ , we have:

$$H_{\alpha}(X)=H(X)-d_{1}(X)(\alpha-1)+R_{X}(\alpha),$$

where the remainder $R_{X}(\alpha)$ is bounded as

$$|R_{X}(\alpha)|\leq d_{2}(X)(\alpha-1)^{2},$$

where

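$$d_{2}(X)=\frac{1}{2}\max_{1/2\leq\alpha^{\prime}\leq 2}\left|\frac{d^{2}H_{\alpha}(X)}{d\alpha^{2}}\Big|_{\alpha=\alpha^{\prime}}\right|.$$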

Since $d_{1}(X)$ and $d_{2}(X)$ are functions of $p_{X}$ , instead of them, we will sometimes write $d_{1}(p_{X})$ and $d_{2}(p_{X})$ , respectively. Similarly, for the conditional Rényi entropy and for $1/2\leq\alpha\leq 2$ , we have

$$H_{\alpha}(X|Y)=H(X|Y)-d_{1}(X|Y)(\alpha-1)+R_{X|Y}(\alpha),$$

where $R_{X|Y}(\alpha)$ is the remainder term, and

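$$d_{1}(X|Y)=-\frac{d}{d\alpha}H_{\alpha}(X|Y)\Big|_{\alpha=1}=\sum_{y\in\mathcal{Y}}p_{Y}(y)d_{1}(p_{X|Y=y})+\frac{1}{2\log e}\left(\sum_{y\in\mathcal{Y}}p_{Y}(y)H(X|Y=y)^{2}-H(X|Y)^{2}\right).$$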

Again, Jensen’s inequality implies that $d_{1}(X|Y)$ is non-negative. Moreover, the remainder $R_{X|Y}(\alpha)$ is bounded as

$$|R_{X|Y}(\alpha)|\leq d_{2}(X|Y)(\alpha-1)^{2},$$

where

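$$d_{2}(X|Y)=\frac{1}{2}\max_{1/2\leq\alpha^{\prime}\leq 2}\left|\frac{d^{2}H_{\alpha}(X|Y)}{d\alpha^{2}}\Big|_{\alpha=\alpha^{\prime}}\right|.$$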

A more detailed analysis of the Rényi entropy with respect to the parameter $\alpha$ can be found in (Beck and Schögl, 1995, Section 5).
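The first-order coefficient of the expansion around $\alpha=1$ can be sanity-checked numerically. The sketch below compares the closed form $\frac{1}{2\log e}\left(\sum_{x}p(x)(\log p(x))^{2}-H(X)^{2}\right)$ with a central finite difference of $-H_{\alpha}(X)$ at $\alpha=1$, for the unconditional case only. The function names, the step size `eps`, and the example pmf are assumptions made for illustration.

```python
import math

def renyi_entropy(p, alpha):
    """Renyi entropy in bits; alpha = 1 returns the Shannon entropy (the limit)."""
    if abs(alpha - 1.0) < 1e-12:
        return -sum(q * math.log2(q) for q in p if q > 0)
    return math.log2(sum(q ** alpha for q in p if q > 0)) / (1.0 - alpha)

def d1(p):
    """Closed form for -dH_alpha/dalpha at alpha = 1, with logarithms in base two."""
    h = renyi_entropy(p, 1.0)
    second_moment = sum(q * math.log2(q) ** 2 for q in p if q > 0)
    return (second_moment - h * h) / (2.0 * math.log2(math.e))

def d1_numeric(p, eps=1e-4):
    """Central finite difference for -dH_alpha/dalpha at alpha = 1."""
    return -(renyi_entropy(p, 1.0 + eps) - renyi_entropy(p, 1.0 - eps)) / (2.0 * eps)

p = [0.5, 0.25, 0.25]
print(d1(p), d1_numeric(p))
```

For a uniform pmf the quantity $\log p(x)$ is constant, so the variance-like term vanishes and $d_{1}$ is zero, consistent with the Rényi entropy of a uniform source not depending on $\alpha$.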

3 Repeated games with leaked randomness source: convergence rate

In this section, we revisit the repeated game of Gossner and Vieille (2002). Here, we focus on its general version with a leaked randomness source studied by Valizadeh and Gohari (2017). Valizadeh and Gohari (2017) characterized the max-min value of the repeated game when the number of the stages of the game tends to infinity. In contrast, we let the number of stages of the game be fixed to $n\in\mathbbm{N}$ , and investigate the rate by which the max-min value of the $n$ -stage game converges to the long-run max-min value.

3.1 Problem statement and results

Consider an $n$ stage repeated zero-sum game between players Alice( $A$ ) and Bob( $B$ ) with respective pure action sets $\mathcal{A}$ and $\mathcal{B}$ . Let $\mathcal{X}$ and $\mathcal{Y}$ be the alphabet of randomness sources of Alice and Bob, respectively, and let $p_{XY}$ be a publicly known pmf on $\mathcal{X}\times\mathcal{Y}$ . At each stage $t\in\{1,2,\dots,n\}$ , random variables $X_{t}\in\mathcal{X}$ and $Y_{t}\in\mathcal{Y}$ are drawn independent of previous drawings according to $p_{XY}$ , where $X_{t}$ is observed by Alice and $Y_{t}$ is observed by Bob. Then, Alice and Bob choose respective actions $A_{t}\in\mathcal{A}$ and $B_{t}\in\mathcal{B}$ . At the end of stage $t$ , players monitor the chosen actions $A_{t}$ and $B_{t}$ , and Alice gets stage payoff $u_{A_{t}B_{t}}$ from Bob. In order to choose $A_{t}$ and $B_{t}$ , players use the history of their observations until stage $t$ . Let $\mathsf{H}_{1}^{t}=(X_{1},A_{1},B_{1},\dots,X_{t-1},A_{t-1},B_{t-1},X_{t})$ and $\mathsf{H}_{2}^{t}=(Y_{1},A_{1},B_{1},\dots,Y_{t-1},A_{t-1},B_{t-1},Y_{t})$ denote the history of observation of Alice and Bob (respectively) up to stage $t$ . Then, $A_{t}=\sigma_{t}(\mathsf{H}_{1}^{t})$ and $B_{t}=\tau_{t}(\mathsf{H}_{2}^{t})$ , where $\sigma_{t}:(\mathcal{A}\times\mathcal{B})^{t-1}\times\mathcal{X}^{t}\to\mathcal{A}$ and $\tau_{t}:(\mathcal{A}\times\mathcal{B})^{t-1}\times\mathcal{Y}^{t}\to\mathcal{B}$ are deterministic functions by which Alice and Bob map their observations into their actions at stage $t$ . Notice that the mappings $\sigma_{t}$ and $\tau_{t}$ are deterministic which means that the only source of randomization are $\mathsf{H}_{1}^{t}$ (for Alice) and $\mathsf{H}_{2}^{t}$ (for Bob). We call the $n$ -tuples $\sigma^{n}=(\sigma_{1},\sigma_{2},\dots,\sigma_{n})$ and $\tau^{n}=(\tau_{1},\tau_{2},\dots,\tau_{n})$ the strategies of Alice and Bob, respectively. 
The expected average payoff for Alice up to stage $n$ induced by strategies $\sigma^{n}$ and $\tau^{n}$ is denoted by $\lambda(\sigma^{n},\tau^{n})$ :

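$$\lambda(\sigma^{n},\tau^{n})=\mathbb{E}_{\sigma^{n},\tau^{n}}\left[\frac{1}{n}\sum_{t=1}^{n}u_{A_{t}B_{t}}\right],$$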

where $\mathbb{E}_{\sigma^{n},\tau^{n}}$ denotes the expectation with respect to the distribution induced by i.i.d. repetitions of $p_{XY}$ and strategies $\sigma^{n}$ and $\tau^{n}$ . Alice wishes to maximize $\lambda(\sigma^{n},\tau^{n})$ and Bob’s goal is to minimize it.
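For small $n$, the expected average payoff $\lambda(\sigma^{n},\tau^{n})$ can be computed exactly by enumerating all realizations of the source pairs. The sketch below does this for the causal game; the calling convention for the strategies (past action pairs plus the source symbols seen so far), the matching-pennies payoff, and the two example sources are assumptions chosen for illustration.

```python
import itertools

def expected_average_payoff(p_xy, u, sigma, tau, n):
    """Exact expected average payoff to Alice over n stages (causal observation)."""
    total = 0.0
    for draw in itertools.product(list(p_xy), repeat=n):
        prob = 1.0
        for pair in draw:
            prob *= p_xy[pair]          # i.i.d. source pairs
        history, payoff = [], 0.0
        for t in range(n):
            xs = tuple(x for x, _ in draw[: t + 1])   # Alice's symbols up to stage t
            ys = tuple(y for _, y in draw[: t + 1])   # Bob's symbols up to stage t
            a = sigma(tuple(history), xs)
            b = tau(tuple(history), ys)
            payoff += u[(a, b)]
            history.append((a, b))
        total += prob * payoff / n
    return total

# Matching pennies: Alice gets +1 on a match and -1 otherwise.
u = {(a, b): (1 if a == b else -1) for a in (0, 1) for b in (0, 1)}

def play_current_x(history, xs):
    return xs[-1]

def always_zero(history, ys):
    return 0

def mismatch_current_y(history, ys):
    return 1 - ys[-1]

no_leak = {(0, 0): 0.5, (1, 0): 0.5}    # X is a fair bit, Y is constant
full_leak = {(0, 0): 0.5, (1, 1): 0.5}  # Y reveals X completely

print(expected_average_payoff(no_leak, u, play_current_x, always_zero, 3))
print(expected_average_payoff(full_leak, u, play_current_x, mismatch_current_y, 3))
```

With no leakage Alice's fair bit gives average payoff zero against a fixed action, whereas with full leakage Bob can mismatch every stage and push the payoff to $-1$. The enumeration grows exponentially in $n$, so this is only useful for toy instances.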

We will refer to the above game as “the repeated game with leaked randomness source”. Another variant of this game, called “the repeated game with non-causal leaked randomness source”, is defined in the following remark.

Remark 2.

In the definition of the repeated game with leaked randomness source, we assumed that the randomness sources $X^{n}=(X_{1},\dots,X_{n})$ and $Y^{n}=(Y_{1},\dots,Y_{n})$ are revealed to Alice and Bob causally as the game is played out. However, we can also consider the non-causal case in which the sources $X^{n}$ and $Y^{n}$ are observed by Alice and Bob (respectively) before the game starts. In this case we have $\mathsf{H}_{1}^{t}=(X^{n},A^{t-1},B^{t-1})$ and $\mathsf{H}_{2}^{t}=(Y^{n},A^{t-1},B^{t-1})$ . In order to distinguish the above two cases, we name the non-causal game as “the repeated game with non-causal leaked randomness source”.

Definition 3.

Let $v$ be an arbitrary real value:

1. Alice can secure $v$ in the $n$-stage repeated game if there exists a strategy $\sigma^{n}$ for Alice such that for every strategy $\tau^{n}$ of Bob we have $\lambda(\sigma^{n},\tau^{n})\geq v$ . The maximum of the set of payoffs $v$ that Alice can secure in the $n$-stage repeated game is called the max-min value of the $n$-stage game.

2. Alice can secure $v$ in the long run if there exists a sequence of strategies $\{\sigma^{n}\}_{n\in\mathbbm{N}}$ for Alice such that for all sequences of strategies $\{\tau^{n}\}_{n\in\mathbbm{N}}$ of Bob we have $\liminf_{n\to\infty}\lambda(\sigma^{n},\tau^{n})\geq v$ . The supremum of the set of payoffs $v$ that Alice can secure in the long run is called the long-run max-min value of the game.

The set of all payoffs that can be secured in long run in the repeated game with leaked randomness source is characterized by Valizadeh and Gohari (2017) and restated here as Theorem 5. Before presenting Theorem 5, we need the following definition.

Definition 4.

In a stage game, the security level of mixed action $p_{A}$ for Alice is denoted by $U^{(A)}(p_{A})$ , and is defined as follows:

$$U^{(A)}(p_{A})=\min_{b\in\mathcal{B}}\sum_{a\in\mathcal{A}}p_{A}(a)u_{ab}.$$

Furthermore, the maximum payoff that Alice can secure in a stage game, by playing mixed actions of entropy at most $\mathsf{h}$ , is denoted by $\mathcal{J}^{(A)}(\mathsf{h})$ , and is defined as:

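$$\mathcal{J}^{(A)}(\mathsf{h})=\max_{p_{A}\in\Delta(\mathcal{A}):\,H(p_{A})\leq\mathsf{h}}U^{(A)}(p_{A}).$$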
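As a minimal sketch of Definition 4, the snippet below evaluates the security level of a mixed action and its Shannon entropy for a small game. It assumes that the security level is the worst-case expected payoff over Bob's pure actions, and that entropy is measured in bits; the function names and the matching-pennies payoff table are hypothetical and chosen only for illustration.

```python
import math

def entropy_bits(p):
    """Shannon entropy (base 2, an assumption) of a pmf given as a sequence."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def security_level(p_a, payoff):
    """Worst-case expected payoff of mixed action p_a over Bob's pure actions.

    payoff[a][b] is Alice's stage payoff u_ab (assumed reading of Definition 4).
    """
    n_b = len(payoff[0])
    return min(
        sum(p_a[a] * payoff[a][b] for a in range(len(p_a)))
        for b in range(n_b)
    )

# Hypothetical example: matching pennies.
MATCHING_PENNIES = [[1, -1], [-1, 1]]
uniform = [0.5, 0.5]   # one bit of entropy, security level 0
pure = [1.0, 0.0]      # zero entropy, security level -1
```

In this example, spending one bit of entropy per stage raises the secured payoff from $-1$ to $0$, which is the kind of entropy-payoff trade-off that $\mathcal{J}^{(A)}(\mathsf{h})$ captures.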

Theorem 5 (Valizadeh and Gohari (2017)).

Let $\mathcal{J}^{(A)}_{\text{cav}}(\mathsf{h})$ be the upper concave envelope of $\mathcal{J}^{(A)}(\mathsf{h})$ defined in Definition 4. In the repeated game with leaked randomness source, Alice can secure $v$ in long run if and only if $v\leq\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$. Furthermore, in the $n$-stage game, for any $n\in\mathbbm{N}$, Alice can secure $v$ only if $v\leq\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$.

Theorem 5 implies that the long run max-min value of the repeated game with leaked randomness source is $\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$ . Moreover, the max-min value of the $n$ -stage game is at most $\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$ . In the following theorems we discuss how the max-min value of the $n$ -stage game converges to $\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$ as $n$ increases.
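The upper concave envelope that appears in Theorem 5 can be illustrated numerically. The sketch below is not the paper's construction: it assumes that $\mathcal{J}^{(A)}$ has been sampled at finitely many entropy levels and computes the upper concave envelope of those samples with a standard upper-hull scan. The function names and the sample points are hypothetical.

```python
def upper_concave_envelope(points):
    """Vertices of the upper concave envelope of (x, y) samples, sorted by x."""
    best = {}
    for x, y in points:              # keep the largest y for each x
        best[x] = max(y, best.get(x, y))
    hull = []
    for p in sorted(best.items()):
        # Drop the last vertex while it lies on or below the chord hull[-2] -> p.
        while len(hull) >= 2:
            (ox, oy), (ax, ay) = hull[-2], hull[-1]
            cross = (ax - ox) * (p[1] - oy) - (ay - oy) * (p[0] - ox)
            if cross < 0:
                break
            hull.pop()
        hull.append(p)
    return hull

def envelope_value(hull, x):
    """Evaluate the piecewise-linear envelope at x inside the sampled range."""
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return (1 - t) * y0 + t * y1
    if len(hull) == 1 and hull[0][0] == x:
        return hull[0][1]
    raise ValueError("x outside sampled range")

# Hypothetical samples (entropy level, secured payoff).
samples = [(0.0, 0.0), (1.0, 0.0), (2.0, 4.0)]
hull = upper_concave_envelope(samples)
```

Each value of the envelope is a mixture of at most two sampled points, which mirrors the time-sharing between two mixed actions used later in the proofs.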

Theorem 6.

In the repeated game with leaked randomness source, there exist real numbers $r>0$ , $\beta>0$ , $\gamma\geq 0$ and $\mu\geq 0$ , such that the following property holds: for arbitrary sequences $\{f_{n}\}_{n\in\mathbbm{N}}$ , $\{g_{n}\}_{n\in\mathbbm{N}}$ and $\{h_{n}\}_{n\in\mathbbm{N}}$ satisfying $f_{n}\in\mathbbm{N}$ , $0\leq g_{n}\leq r$ , and $0\leq h_{n}\leq 1$ , one can find a sequence of strategies $\{\sigma^{n}\}_{n\in\mathbbm{N}}$ such that for all sequences of strategies $\{\tau^{n}\}_{n\in\mathbbm{N}}$ of Bob and for all $n\in\mathbbm{N}$ we have

[TABLE]

We give an intuitive description of the terms in Equation (10) in Discussion 8 below. The formal proof of Theorem 6 is presented in Section 3.3.

Corollary 7.

In the repeated game with leaked randomness source, for each $n\in\mathbbm{N}$ , let $v_{n}$ denote the max-min value of the $n$ -stage game. $v_{n}$ converges to $\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$ with a rate of at least ${\sqrt{\log n}}/{\sqrt[4]{n}}$ . To see this, let $r,\beta,\gamma,\mu$ be the values in the statement of Theorem 6, and let $k$ be an arbitrary positive number such that $\beta>k\gamma$ and $kr\leq 1$ . Define $f_{n}=\lceil kr^{2}(\beta-k\gamma)\sqrt{n}\rceil$ , $g_{n}=r\sqrt{\log n}/\sqrt[4]{n}$ , and $h_{n}=kg_{n}$ . Then, Theorem 6 implies that there exists a sequence of strategies $\{\sigma^{n}\}_{n\in\mathbbm{N}}$ such that for all sequences of strategies $\{\tau^{n}\}_{n\in\mathbbm{N}}$ of Bob, and for all $n\in\mathbbm{N}$ , we have

[TABLE]

To see this, observe that $\frac{1}{n}+\frac{1}{f_{n}}+\frac{f_{n}}{n}$ is decaying faster than $g_{n}$ . And

[TABLE]
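The choice of sequences in Corollary 7 can be checked numerically. The following sketch computes $f_{n}$, $g_{n}$ and $h_{n}$ as defined above and the ratio of $\frac{1}{n}+\frac{1}{f_{n}}+\frac{f_{n}}{n}$ to $g_{n}$. The constants $r,\beta,\gamma,k$ used in the example are hypothetical (Theorem 6 only asserts that suitable constants exist), and the base-2 logarithm is an assumption.

```python
import math

def corollary7_sequences(n, r, beta, gamma, k):
    """Sequences f_n, g_n, h_n as chosen in Corollary 7 (log base 2 assumed)."""
    f_n = math.ceil(k * r**2 * (beta - k * gamma) * math.sqrt(n))
    g_n = r * math.sqrt(math.log2(n)) / n**0.25
    h_n = k * g_n
    return f_n, g_n, h_n

def overhead_ratio(n, r, beta, gamma, k):
    """Ratio of (1/n + 1/f_n + f_n/n) to g_n; it should shrink as n grows."""
    f_n, g_n, _ = corollary7_sequences(n, r, beta, gamma, k)
    return (1 / n + 1 / f_n + f_n / n) / g_n

# Hypothetical constants satisfying beta > k * gamma and k * r <= 1.
CONSTANTS = dict(r=0.5, beta=1.0, gamma=0.5, k=1.0)
ratio_small = overhead_ratio(10**4, **CONSTANTS)
ratio_large = overhead_ratio(10**8, **CONSTANTS)
```

With these constants the ratio decreases as $n$ grows, consistent with the block-related terms being of order $1/\sqrt{n}$ while $g_{n}$ is of order $\sqrt{\log n}/\sqrt[4]{n}$.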

Discussion 8.

We explain Equation (10) at an intuitive level. To generate the strategies $\{\sigma^{n}\}_{n\in\mathbbm{N}}$ of Theorem 6, we divide the total $n$ stages almost uniformly into $f_{n}$ blocks such that the actions of each block (besides the first block) are generated as a function of the randomness source observed during the previous block, and in all stages of the first block, an arbitrary action $a\in\mathcal{A}$ is played. Therefore, some payoff is lost during the first block; the term $1/f_{n}$ in Equation (10) corresponds to this loss. On the other hand, by dividing the total stages into $f_{n}$ blocks we get blocks of length at least $n/f_{n}-1$. This affects the precision of the simulation of the intended distribution of actions from the randomness source observed in the previous block, which is reflected in the term

[TABLE]

This equation should be compared with (2), where the exponent of the simulation error is expressed as the product of three terms: the block length, a term $1-{1}/{\alpha}$, and the entropy difference $H_{\alpha}(X|Y)-H_{\frac{1}{\alpha}}(A)$. The term $n/f_{n}-1$ appears in Equation (11) as the block length (the length of each of the $f_{n}$ blocks is at least $n/f_{n}-1$). The sequence $h_{n}=\alpha-1$ is a proxy for the term $1-{1}/{\alpha}$. Finally, considering the last term $H_{\alpha}(X|Y)-H_{\frac{1}{\alpha}}(A)$, we see that a larger entropy difference yields better simulation performance. On the other hand, requiring a large entropy difference restricts the set of action distributions $A$ and results in a payoff loss. The sequence $g_{n}$ is responsible for this trade-off. A larger $g_{n}$ results in more loss in payoff (the term $g_{n}$ in Equation (10)) but a more accurate simulation (the term $g_{n}$ in the exponent of the exponential term in Equation (10)).

Next, consider the repeated game with non-causal leaked randomness source (see Remark 2), where the players observe the whole sequence of their corresponding randomness sources before the game starts. We claim the following result:

Theorem 9.

In the repeated game with non-causal leaked randomness source (as described in Remark 2), there exist real numbers $r>0$ , $\beta>0$ , $\gamma\geq 0$ and $\mu\geq 0$ with the following property: for arbitrary sequences of positive numbers $\{g_{n}\}_{n\in\mathbbm{N}}$ and $\{h_{n}\}_{n\in\mathbbm{N}}$ satisfying $g_{n}\leq r$ and $h_{n}\leq 1$ , there exists a sequence of strategies $\{\sigma^{n}\}_{n\in\mathbbm{N}}$ such that for all sequences of strategies $\{\tau^{n}\}_{n\in\mathbbm{N}}$ of Bob and for all $n\in\mathbbm{N}$ we have

[TABLE]

Proof of Theorem 9 is given in Section 3.4.

Corollary 10.

In the repeated game with non-causal leaked randomness source, for each $n\in\mathbbm{N}$ , let $v^{\prime}_{n}$ denote the max-min value of the $n$ -stage game. $v^{\prime}_{n}$ converges to $\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$ with a rate of at least ${\sqrt{\log n}}/{\sqrt{n}}$ . To see this, let $r,\beta,\gamma,\mu$ be the values in the statement of Theorem 9, and let $k$ be an arbitrary positive number such that $\beta>k\gamma$ and $rk\leq 1$ . Define $g_{n}=\min\{r,(\frac{\log n}{k(\beta-k\gamma)n})^{\frac{1}{2}}\}$ , and $h_{n}=kg_{n}$ . Then, using similar calculations as in Corollary 7, Theorem 9 implies that there exists a sequence of strategies $\{\sigma^{n}\}_{n\in\mathbbm{N}}$ such that for all sequences of strategies $\{\tau^{n}\}_{n\in\mathbbm{N}}$ of Bob, and for all $n\in\mathbbm{N}$ , we have

[TABLE]

Theorem 6 and Theorem 9 provide a convergence rate for general games. However, in some special cases we can derive faster convergence rates for the max-min value of the game. The following theorem provides a special case in which an exponential convergence is obtained.

Theorem 11.

Let $q_{A}\in\Delta(\mathcal{A})$ be an equilibrium strategy for Alice in the one stage game, i.e.,

[TABLE]

If $H(X|Y)>H(q_{A})$ , then, in the repeated game with non-causal leaked randomness source, there exist real numbers $\beta,\gamma>0$ , and a sequence of strategies $\{\sigma^{n}\}_{n\in\mathbbm{N}}$ such that for all sequences of strategies $\{\tau^{n}\}_{n\in\mathbbm{N}}$ of Bob and for all $n\in\mathbbm{N}$ , we have

[TABLE]

The proof of Theorem 11 is provided in Section 3.5.

3.2 A technical tool: simulation of a source from another source

To prove the results of Section 3.1, we need a technical tool provided in this section. Here, we study the simulation of a desired single letter source $A\in\mathcal{A}$ from a given single letter source $X\in\mathcal{X}$ . We assume that $X$ is correlated with a side information $Y\in\mathcal{Y}$ , and we would like the generated source to be almost independent of the side information $Y$ . More precisely, we have the following definition:

Definition 12.

Let $(X,Y)\in\mathcal{X}\times\mathcal{Y}$ be distributed according to $p_{XY}$ , and $A\in\mathcal{A}$ be distributed according to $p_{A}$ . We say that the deterministic mapping $f:\mathcal{X}\to\mathcal{A}$ simulates $A$ from $X$ with precision $\epsilon$ if we have

$$\|p_{f(X)Y}-p_{A}p_{Y}\|_{TV}\leq\epsilon,$$

where $p_{f(X)Y}$ is the joint distribution of $f(X)$ and $Y$ .
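To make Definition 12 concrete, the sketch below computes the distance between the joint law of $(f(X),Y)$ and the product $p_{A}p_{Y}$ for finite alphabets. It assumes the half-$L_{1}$ convention for total variation, which may differ from the paper's normalization by a constant factor, and it assumes that $f$ maps into the support listed in `p_a`. All names and the toy source are hypothetical.

```python
from collections import defaultdict

def simulation_error(p_xy, f, p_a):
    """Total variation between the law of (f(X), Y) and the product p_A x p_Y.

    p_xy: dict {(x, y): prob}; f: dict {x: a}; p_a: dict {a: prob}.
    Uses the half-L1 convention for total variation (an assumption).
    """
    joint = defaultdict(float)   # law of (f(X), Y)
    p_y = defaultdict(float)     # marginal law of Y
    for (x, y), pr in p_xy.items():
        joint[(f[x], y)] += pr
        p_y[y] += pr
    return 0.5 * sum(
        abs(joint.get((a, y), 0.0) - p_a[a] * p_y[y])
        for a in p_a for y in p_y
    )

# Hypothetical source: X uniform on {0,1,2,3}, and Y reveals the high bit of X.
p_xy = {(x, x // 2): 0.25 for x in range(4)}
p_a = {0: 0.5, 1: 0.5}
low_bit = {x: x % 2 for x in range(4)}    # independent of Y: perfect simulation
high_bit = {x: x // 2 for x in range(4)}  # fully known to Bob: poor simulation
```

Here `low_bit` simulates a fair coin with precision $0$, whereas `high_bit` produces a fair coin that is completely determined by $Y$ and therefore has a large error.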

According to the above definition, we are interested in a deterministic mapping that simulates $A$ from $X$ . However, we utilize the probabilistic method and random mappings, as a tool to ultimately prove existence of a suitable deterministic mapping. Therefore, we now define a random mapping and proceed by proving some properties for it. These properties will then lead to the construction of the desired deterministic mapping.

To specify a deterministic mapping $f:\mathcal{X}\to\mathcal{A}$ , we need to specify the value of $f(x)$ for all $x\in\mathcal{X}$ . To specify a random mapping $F:\mathcal{X}\to\mathcal{A}$ , we need to specify the joint distribution of the random variables $F(x)$ for $x\in\mathcal{X}$ .

Definition 13.

$F:\mathcal{X}\to\mathcal{A}$ is a random mapping constructed as follows: assume that $F(x)$ for different values of $x$ are i.i.d. according to $p_{A}(a)$. In other words, given symbols $a_{x}\in\mathcal{A}$ for all $x\in\mathcal{X}$,

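$$\mathbb{P}\left[F(x)=a_{x}\ \text{for all}\ x\in\mathcal{X}\right]=\prod_{x\in\mathcal{X}}p_{A}(a_{x}).$$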

The above construction of the random mapping $F$ defines a probability measure $p_{F}$ on the set of all mappings $f:\mathcal{X}\to\mathcal{A}$ denoted by $\mathcal{F}$ .
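A random mapping in the sense of Definition 13 can be sampled directly. The sketch below draws $F(x)$ independently from $p_{A}$ for each $x$; the function name, the alphabet and the fixed seed are hypothetical choices made only for this illustration.

```python
import random

def sample_random_mapping(xs, p_a, rng):
    """Draw F(x) independently from p_a for every x in xs; returns {x: a}."""
    actions = list(p_a)
    weights = [p_a[a] for a in actions]
    return {x: rng.choices(actions, weights=weights)[0] for x in xs}

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
mapping = sample_random_mapping(range(8), {"a": 0.5, "b": 0.5}, rng)
```

Each call returns one outcome $f$ of the random mapping $F$, that is, one draw from the measure $p_{F}$ on $\mathcal{F}$.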

Lemma 14.

Let $(X,Y)\in\mathcal{X}\times\mathcal{Y}$ be distributed according to $p_{XY}$ and $A\in\mathcal{A}$ according to $p_{A}$ . Furthermore, let $F$ be the random mapping defined in Definition 13. Then,

[TABLE]

where $p_{f(X)Y}$ is the joint distribution of $f(X)$ and $Y$ . Consequently, there exists a deterministic mapping $f:\mathcal{X}\to\mathcal{A}$ such that for all $\alpha\in[1,2]$ , we have

[TABLE]

Proof of Lemma 14 is provided in Appendix A.

While the above inequality ensures the existence of a deterministic mapping $f:\mathcal{X}\to\mathcal{A}$ where (15) holds, it does not provide an explicit mapping $f$ . An explicit construction is desirable from an algorithmic perspective. In the following, we address this issue by showing that any randomly chosen mapping $f:\mathcal{X}\to\mathcal{A}$ would almost satisfy (15) with very high probability.

Let $D_{TV}=\|p_{F(X)Y}-p_{A}p_{Y}\|_{TV}$ . The quantity $D_{TV}$ is random because $F$ is random. Thus, random variable $D_{TV}$ is a function of the random variable $F$ , i.e., $D_{TV}$ takes value $\|p_{f(X)Y}-p_{A}p_{Y}\|_{TV}$ with probability $p_{F}(f)$ . Hence, Lemma 14 implies that for all $\alpha\in[1,2]$ ,

[TABLE]

We claim the following bound on how $D_{TV}$ concentrates around its expected value.

Proposition 15.

For the random variable $D_{TV}$ , we have

[TABLE]

Proof of Proposition 15 is presented in Appendix B.

One application of Proposition 15 is for simulation of i.i.d. sequences. Let $(X^{n},Y^{n})$ be i.i.d. according to $p_{XY}$ , and let $A^{n}$ be i.i.d. according to $p_{A}$ . Assume that $H(X|Y)>H(A)$ so that simulation of $A^{n}$ with arbitrary precision is possible. Let $F:\mathcal{X}^{n}\to\mathcal{A}^{n}$ be the random mapping of Definition 13, where $(X,Y,A)$ is replaced with $(X^{n},Y^{n},A^{n})$ . Let us choose $\alpha>1$ such that $H_{\alpha}(X|Y)>H_{\frac{1}{\alpha}}(A)$ (note that such a real number exists since $H(X|Y)>H(A)$ , and Rényi entropy converges to Shannon entropy as $\alpha$ tends to $1$ ). Let $\epsilon$ be a positive number such that

[TABLE]

Then, Lemma 14 implies

[TABLE]

where $D_{TV}=\|p_{F(X^{n})Y^{n}}-p_{A^{n}}p_{Y^{n}}\|_{TV}$ . Furthermore, from Proposition 15, for $t=2^{-\epsilon n}$ , we have

[TABLE]

The above equation along with Equation (16) and definition $\delta=H_{2}(X)-2\epsilon$ implies

[TABLE]

In other words, the outcome of the random mapping $F$ , with probability at least $1-2e^{-2^{\delta n}}$ (converging double exponentially to $1$ ) will simulate $A^{n}$ with precision at most $2\times 2^{-\epsilon n}$ (decaying exponentially in $n$ ).
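The argument above relies on the Rényi entropy approaching the Shannon entropy as $\alpha\to 1$. The sketch below illustrates this for an unconditional source, using the standard definition $H_{\alpha}(p)=\frac{1}{1-\alpha}\log_{2}\sum_{a}p(a)^{\alpha}$. The conditional Rényi entropy $H_{\alpha}(X|Y)$ used in the paper is not reproduced here, and the example pmf is hypothetical.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1), in bits."""
    return math.log2(sum(q**alpha for q in p if q > 0)) / (1 - alpha)

# Hypothetical pmf used only for illustration.
pmf = [0.7, 0.2, 0.1]
h_shannon = shannon_entropy(pmf)
h_two = renyi_entropy(pmf, 2.0)      # order alpha = 2
h_half = renyi_entropy(pmf, 0.5)     # order 1 / alpha = 1/2
h_near_one = renyi_entropy(pmf, 1.0001)
```

Since the Rényi entropy is non-increasing in its order, $H_{2}\leq H\leq H_{1/2}$ for this pmf, and both bounds tighten as $\alpha\to 1$. This is why a strict Shannon-entropy gap guarantees a strict Rényi gap for some $\alpha>1$.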

3.3 Proof of Theorem 6

Let us divide the total stages, $n$ , into $f_{n}$ blocks, where $\{f_{n}\}_{n\in\mathbbm{N}}$ is the arbitrary sequence of natural numbers in the statement of the theorem. Let $k_{n}$ be the remainder of $n$ divided by $f_{n}$ , i.e., $n=\lfloor n/f_{n}\rfloor f_{n}+k_{n}$ . Then, the number of stages in each block, $\{N_{n,i}\}_{i=1}^{f_{n}}$ , is computed as follows:

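$$N_{n,i}=\begin{cases}\lfloor n/f_{n}\rfloor+1, & 1\leq i\leq k_{n},\\ \lfloor n/f_{n}\rfloor, & k_{n}<i\leq f_{n}.\end{cases}\tag{17}$$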

In other words, first, all blocks get $\lfloor n/f_{n}\rfloor$ stages, then, the remaining $k_{n}$ stages are assigned to the first $k_{n}$ blocks.
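The block partition described above can be sketched in a few lines. The function name is hypothetical; the rule follows the text: every block receives $\lfloor n/f_{n}\rfloor$ stages and the first $k_{n}$ blocks receive one extra stage.

```python
def block_lengths(n, f_n):
    """Lengths N_{n,1}, ..., N_{n,f_n} of the blocks partitioning n stages."""
    base, k_n = divmod(n, f_n)   # n = base * f_n + k_n
    return [base + 1 if i < k_n else base for i in range(f_n)]

lengths = block_lengths(10, 3)   # -> [4, 3, 3]
```

The resulting lengths are non-increasing in the block index and each is at least $n/f_{n}-1$, the two properties the proof uses.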

Let $A_{i}^{N_{n,i}}=(A_{i,1},A_{i,2},\dots,A_{i,N_{n,i}})$ and $B_{i}^{N_{n,i}}=(B_{i,1},B_{i,2},\dots,B_{i,N_{n,i}})$ denote the sequence of actions played in block $i=1,\dots,f_{n}$ by Alice and Bob, respectively. Similarly, let $X_{i}^{N_{n,i}}=(X_{i,1},X_{i,2},\dots,X_{i,N_{n,i}})$ and $Y_{i}^{N_{n,i}}=(Y_{i,1},Y_{i,2},\dots,Y_{i,N_{n,i}})$ denote the sequence of random sources observed in block $i$ by Alice and Bob, respectively. We generate strategy $\sigma^{n}$ for Alice as follows: in all stages of the first block, Alice chooses an arbitrary action $a\in\mathcal{A}$ ; in each block $i\geq 2$ , Alice chooses her action sequence $A_{i}^{N_{n,i}}$ as a deterministic function of the sequence of random sources observed during the previous block, $X_{i-1}^{N_{n,i-1}}$ . Let us denote this deterministic function by $\varphi_{i}$ . Thus, we have

$$A_{i}^{N_{n,i}}=\varphi_{i}\left(X_{i-1}^{N_{n,i-1}}\right),\qquad i=2,\dots,f_{n}.$$

In order to fulfill the definition of the strategy $\sigma^{n}$ , it suffices to determine the functions $\varphi_{i}$ for $i=2,\dots,f_{n}$ . We will now determine the functions $\varphi_{i}$ after presenting some preliminaries.

Considering the definition of the function $\mathcal{J}^{(A)}_{\text{cav}}(\cdot)$, there exist a real number $0\leq r\leq 1$ and pmfs $p_{A}^{(1)},p_{A}^{(2)}\in\Delta(\mathcal{A})$ such that:

[TABLE]

Without loss of generality, we may assume that $H(p_{A}^{(1)})\geq H(p_{A}^{(2)})$ . The following lemma claims that we may assume that $r$ , $p_{A}^{(1)}$ and $p_{A}^{(2)}$ also satisfy the following equations:

[TABLE]

Lemma 16.

Theorem 6 holds if either of Equations (20) and (21) fails to hold.

Proof of the above lemma is given later in Section 3.3.1.

We identify the value of $r$ in the statement of the theorem as the one given by Equations (18) and (19). The values for $\beta>0$ , $\gamma\geq 0$ and $\mu\geq 0$ will be identified later. Take an arbitrary sequence $\{g_{n}\}_{n\in\mathbbm{N}}$ of positive numbers, as in the statement of the theorem, such that $g_{n}\leq r$ , for all $n\in\mathbbm{N}$ . Let

[TABLE]

Moreover, consider an ideal distribution $q_{A_{i}^{N_{n,i}}}$ defined as follows for $i=2,\dots,f_{n}$ :

[TABLE]

For each $i=2,\dots,f_{n}$ , we choose $\varphi_{i}$ to be the mapping of Lemma 14 that simulates $q_{A_{i}^{N_{n,i}}}$ from $X_{i-1}^{N_{n,i-1}}$ ; hence, for all $1\leq\alpha\leq 2$ , we have

[TABLE]

where $p_{A_{i}^{N_{n,i}}Y_{i-1}^{N_{n,i-1}}}$ is the joint pmf of $A_{i}^{N_{n,i}}$ and $Y_{i-1}^{N_{n,i-1}}$ . Next, note that

[TABLE]

On the other hand, since $(X_{i-1}^{N_{n,i-1}},Y_{i-1}^{N_{n,i-1}})$ are drawn i.i.d. from $p_{XY}$ , we have

[TABLE]

where we used $N_{n,i-1}\geq N_{n,i}$ , which follows from the definition given in Equation (17). Moreover, let $r_{n,i}$ be a fractional approximation of $r$ defined as follows

[TABLE]

Observe that

[TABLE]

Equations (23), (24) and (25) imply

[TABLE]

Using Equations (3)-(6), we bound the exponent of the exponential term in the right-hand side of the above equation as below:

[TABLE]

where in (27) we used the fact that $0\leq r_{n,i}\leq 1$ and $\alpha\geq 1$ . On the other hand, Equations (19) and (20) along with the fact that $r_{n,i}\leq r-g_{n}$ imply

[TABLE]

where

[TABLE]

Next, let us define

[TABLE]

Then, Equations (27) and (28) imply

[TABLE]

where (29) results from $\alpha\leq 2$ . By using (29) in (26), and simplifications $1-1/\alpha\geq(\alpha-1)/2$ and $N_{n,i}\geq n/f_{n}-1$ we obtain

[TABLE]

Next, let $\alpha=1+h_{n}$ , where $\{h_{n}\}_{n\in\mathbbm{N}}$ is the arbitrary sequence of positive real numbers in the statement of the theorem. Then, Equation (30) results in

[TABLE]

Now, we need to include the sequence of actions of Bob at the $i$-th block ($B_{i}^{N_{n,i}}$) into Equation (31). To do so, note that $A_{i}^{N_{n,i}}$ is independent of Alice’s actions in all blocks, except for the $i$-th block. This is because the $X$-source is i.i.d. and $A_{i}^{N_{n,i}}$ is a function of $X_{i-1}^{N_{n,i-1}}$. Therefore, at the $t$-th stage of block number $i$, Bob obtains information about $X_{i-1}^{N_{n,i-1}}$ only through his source $Y_{i-1}^{N_{n,i-1}}$ and prior actions $A_{i}^{t-1}$. In other words, $B_{i,t}$ is conditionally independent of $X_{i-1}^{N_{n,i-1}}$ given $Y_{i-1}^{N_{n,i-1}},A_{i}^{t-1},B_{i}^{t-1}$. Since $A_{i}^{N_{n,i}}=\varphi_{i}(X_{i-1}^{N_{n,i-1}})$, $B_{i,t}$ is also conditionally independent of $A_{i,t},A_{i,t+1},\cdots,A_{i,N_{n,i}}$ given $Y_{i-1}^{N_{n,i-1}},A_{i}^{t-1},B_{i}^{t-1}$. Thus,

[TABLE]

Then, utilizing the first property of total variation in Lemma 1 for random variables $E=(A_{i}^{N_{n,i}},Y_{i-1}^{N_{n,i-1}})$ and $F=B_{i}^{N_{n,i}}$ we conclude from (31) that

[TABLE]

Next, by utilizing the second property of total variation in Lemma 1 for random variables $E=(A_{i}^{N_{n,i}},B_{i}^{N_{n,i}})$ and $F=Y_{i-1}^{N_{n,i-1}}$ , and replacing $q_{A_{i}^{N_{n,i}}}$ from Equation (22) we conclude

[TABLE]

In other words, the distribution of the generated actions $A_{i}^{N_{n,i}}$ is in distance $\delta_{n}$ from the ideal distribution $q_{A_{i}^{N_{n,i}}}$. Note that the ideal distribution $q_{A_{i}^{N_{n,i}}}$ secures payoff $m_{n,i}U^{(A)}(p_{A}^{(1)})+(N_{n,i}-m_{n,i})U^{(A)}(p_{A}^{(2)})$ in the $i$-th block. Therefore, in the $i$-th block, the generated strategy $\sigma^{n}$ secures payoff

[TABLE]

where $\mathsf{M}=\max_{a\in\mathcal{A},b\in\mathcal{B}}|u_{ab}|$ . Thus, for arbitrary strategy $\tau^{n}$ of Bob we have

[TABLE]

where $\Delta U=|U^{(A)}(p_{A}^{(1)})-U^{(A)}(p_{A}^{(2)})|$ , and inequality (33) follows from Equation (18) and the fact that $|m_{n,i}-rN_{n,i}|\leq g_{n}N_{n,i}+1$ ; Equation (34) is implied by $\sum_{i=1}^{f_{n}}N_{n,i}=n$ , and (35) results from $\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))\leq\mathsf{M}$ and $N_{n,1}\leq n/f_{n}+1$ .

Note that Equation (20) implies that $\beta>0$; therefore, by replacing the value of $\delta_{n}$, and defining $\mu=\max\{2\mathsf{M},\Delta U\}$, (35) implies the claim of the theorem. $\square$

3.3.1 Proof of Lemma 16

We need to consider the case of $r=0$ or $H(p_{A}^{(1)})=H(p_{A}^{(2)})$ .

1.

The case of $r=0$ and $H(p_{A}^{(2)})=0$: here, $p_{A}^{(2)}$ is deterministic (it outputs an action $a\in\mathcal{A}$ with probability $1$), and hence, the trivial strategy of playing $a$ in all stages secures payoff $\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$ for Alice; therefore, in this case, the claim of the theorem holds with $\mu=0$.

2.

The case of $r=0$ and $H(p_{A}^{(2)})>0$ : in this case, let $r^{\prime}=1$ , $q_{A}^{(1)}=p_{A}^{(2)}$ , and let $q_{A}^{(2)}$ be an arbitrary deterministic pmf. Then, $r^{\prime}$ , $q_{A}^{(1)}$ and $q_{A}^{(2)}$ satisfy Equations (18)-(21). Therefore, we can proceed with the proof of Theorem 6 with these assumptions.

3.

If $H(p_{A}^{(1)})=H(p_{A}^{(2)})=0$ , then Alice can achieve $\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$ by playing a pure action, and the claim of the theorem holds with $\mu=0$ .

4.

If $r=1$ and $H(p_{A}^{(1)})=H(p_{A}^{(2)})>0$ , then, we can change $p_{A}^{(2)}$ to an arbitrary deterministic pmf so that Equations (18)-(21) hold. Therefore, we can proceed with the proof of Theorem 6 with these assumptions.

5.

If $0<r<1$ and $H(p_{A}^{(1)})=H(p_{A}^{(2)})>0$ , then, we can change $r$ to $r=1$ , and $p_{A}^{(2)}$ to a deterministic pmf such that Equations (18)-(21) hold. This is because $0<r<1$ and $H(p_{A}^{(1)})=H(p_{A}^{(2)})>0$ imply that $U^{(A)}(p_{A}^{(1)})=U^{(A)}(p_{A}^{(2)})$ , since otherwise, by changing $r$ we would get greater value for $\mathcal{J}^{(A)}_{\text{cav}}(H(X|Y))$ , which contradicts the definition of the upper concave envelope.

3.4 Proof of Theorem 9

The proof is similar to the proof of Theorem 6 with a few modifications. More specifically, in a repeated game with non-causal leaked randomness source we do not need to divide the total $n$ stages into blocks; instead, we can generate all actions of Alice as a function of the whole randomness source. Let $A^{n}=(A_{1},A_{2},\dots,A_{n})$ and $B^{n}=(B_{1},B_{2},\dots,B_{n})$ denote the sequences of actions of Alice and Bob, and let $X^{n}=(X_{1},X_{2},\dots,X_{n})$ and $Y^{n}=(Y_{1},Y_{2},\dots,Y_{n})$ denote the sequences of random sources of Alice and Bob, respectively. We generate strategy $\sigma^{n}$ for Alice such that Alice chooses her action sequence $A^{n}$ as a deterministic function of $X^{n}$, i.e.,

$$A^{n}=\varphi_{n}(X^{n}).$$

We will now determine the function $\varphi_{n}$ after presenting some preliminaries.

As stated in the proof of Theorem 6 in Section 3.3, we assume that there exist a real number $r$ and pmfs $p_{A}^{(1)},p_{A}^{(2)}\in\Delta(\mathcal{A})$ satisfying (18)-(21). Moreover, let

[TABLE]

and let $q_{A^{n}}$ be an ideal distribution of actions defined as follows:

[TABLE]

We choose $\varphi_{n}$ to be the mapping of Lemma 14 that simulates $q_{A^{n}}$ from $X^{n}$ ; hence, for all $1\leq\alpha\leq 2$ we have

[TABLE]

Next, note that $H_{1/\alpha}\left(q_{A^{n}}\right)=m_{n}H_{1/\alpha}\left(p^{(1)}_{A}\right)+(n-m_{n})H_{1/\alpha}\left(p^{(2)}_{A}\right)$, and $H_{\alpha}\left(X^{n}|Y^{n}\right)=nH_{\alpha}\left(X|Y\right)$. Thus, defining $r_{n}=m_{n}/n$, Equation (37) implies

[TABLE]

A similar argument as the one used to prove Equation (29) in Section 3.3 implies

[TABLE]

where $\beta=H(p_{A}^{(1)})-H(p_{A}^{(2)})$ , and

[TABLE]

By using (39) in (38), and simplification $1-1/\alpha\geq(\alpha-1)/2$ , we obtain

[TABLE]

Next, let $\alpha=1+h_{n}$ , where $\{h_{n}\}_{n\in\mathbbm{N}}$ is the arbitrary sequence of positive real numbers in the statement of the theorem; hence, Equation (40) results in

[TABLE]

Now, we need to include the sequence of actions of Bob ( $B^{n}$ ) into Equation (41). Note that at each stage $t$ , Bob has access to information $(Y^{t-1},A^{t-1},B^{t-1})$ ; thus, given an arbitrary strategy $\tau^{n}$ for Bob, we have

[TABLE]

Then, using an argument similar to the one used in Section 3.3 to prove Equation (32), the above equation along with Equation (41) implies

[TABLE]

In other words, the distribution of the generated actions is in distance $\delta^{\prime}_{n}$ from the ideal distribution. Note that the ideal distribution $q_{A^{n}}$ secures payoff $m_{n}U^{(A)}(p_{A}^{(1)})+(n-m_{n})U^{(A)}(p_{A}^{(2)})$. Therefore, we have

[TABLE]

where $\mathsf{M}=\max_{a\in\mathcal{A},b\in\mathcal{B}}|u_{ab}|$ , $\Delta U=|U^{(A)}(p_{A}^{(1)})-U^{(A)}(p_{A}^{(2)})|$ , and the second inequality follows from Equation (18) along with the fact that $|m_{n}-rn|\leq g_{n}n+1$ . By replacing $\delta^{\prime}_{n}$ and defining $\mu=\max\{2\mathsf{M},\Delta U\}$ , we obtain the claim of the theorem.

3.5 Proof of Theorem 11

The inequality $H(X|Y)>H(q_{A})$ along with the fact that $q_{A}$ is an equilibrium strategy for Alice in the stage game implies that

[TABLE]

If Alice could play i.i.d. according to $q_{A}$ , she would have secured payoff $U^{(A)}(q_{A})$ . Our goal is to generate the actions of Alice, $A^{n}$ , as a deterministic function of the randomness source $X^{n}$ in such a way that at every stage $t$ , the distribution of the action $A_{t}$ is almost $q_{A}$ and is almost independent of the past observations of Bob.

The strategy $\sigma^{n}$ is defined as follows: the actions $A^{n}$ are chosen as a deterministic function of $X^{n}$ , i.e., $A^{n}=\varphi_{n}(X^{n})$ . We will now define the mapping $\varphi_{n}$ . Consider an ideal distribution $q_{A^{n}}$ defined as below:

[TABLE]

Let $\varphi_{n}$ be the mapping of Lemma 14 that simulates $q_{A^{n}}$ from $X^{n}$ ; hence, for all $1\leq\alpha\leq 2$ , we have

[TABLE]

where $p_{A^{n}Y^{n}}$ is the joint pmf of $A^{n}$ and $Y^{n}$ . Note that $(X^{n},Y^{n})$ are drawn i.i.d. from $p_{XY}$ , and $q_{A^{n}}$ is i.i.d. as well, thus, we have

[TABLE]

Furthermore, let $\beta$ be defined as follows

[TABLE]

Note that $\lim_{\alpha\to 1}\left(H_{\alpha}(X|Y)-H_{1/\alpha}(q_{A})\right)=H(X|Y)-H(q_{A})>0$ ; hence, $\beta>0$ . Equations (45) and (46) along with the above definition of $\beta$ imply

[TABLE]

Next, let Bob play an arbitrary strategy $\tau^{n}$ and let $B^{n}$ denote the sequence of actions of Bob. At stage $t$ , Bob generates $B_{t}$ as a function of $Y^{n}$ and his previous observations $A^{t-1}$ and $B^{t-1}$ . Hence, we have

[TABLE]

Then, utilizing the first property of total variation in Lemma 1 for random variables $E=(A^{n},Y^{n})$ and $F=B^{n}$ , we conclude from (47) that

[TABLE]

Next, by utilizing the second property of total variation in Lemma 1 for random variables $E=(A^{n},B^{n})$ and $F=Y^{n}$ , and replacing $q_{A^{n}}$ from Equation (44) we conclude

[TABLE]

In other words, the distribution of the generated actions $p_{A^{n}B^{n}}$ is in distance $2^{-\beta n}$ from the ideal distribution $\prod_{t=1}^{n}q_{A}(a_{t})p_{B_{t}|A^{t-1}B^{t-1}}(b_{t}|a^{t-1},b^{t-1})$ . Note that the ideal distribution secures payoff $U^{(A)}(q_{A})$ for Alice. Therefore, Equation (48) implies that

[TABLE]

where $\mathsf{M}=\max_{a\in\mathcal{A},b\in\mathcal{B}}|u_{ab}|$ . Note that $\tau^{n}$ is an arbitrary strategy for Bob, therefore, the above inequality along with (43) implies the claim of the theorem.

4 Approximate Nash equilibria of the repeated game with leaked randomness source

In the repeated game with leaked randomness source defined in Section 3.1, we have forced the players to randomize their actions only by conditioning them on the outcomes of the random sources $X^{n}$ and $Y^{n}$. In this setting, Nash equilibria do not necessarily exist (see Hubáček et al. (2016) and Budinich and Fortnow (2011)). However, approximate Nash equilibria may exist. The goal of this section is to characterize the set of approximate Nash equilibria achievable by the randomness sources $X^{n}$ and $Y^{n}$. To proceed, consider the following definitions.

Definition 17.

In the $n$-stage repeated game, the strategy profile $(\sigma^{n},\tau^{n})$ is an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium if Alice (resp. Bob) cannot increase (resp. decrease) the expected average payoff (defined in Equation (7)) by more than $\epsilon_{A}$ (resp. $\epsilon_{B}$) by changing her (resp. his) strategy unilaterally.
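The two inequalities in Definition 17 can be illustrated in the simplest setting. The sketch below checks them for a one-stage game in which both players may use arbitrary mixed actions, so that the best unilateral deviation is a pure action. This is a simplification for illustration only: it ignores the randomness constraints of the repeated game, and all names and the payoff table are hypothetical.

```python
def expected_payoff(p_a, p_b, payoff):
    """Expected payoff to Alice when the players mix independently."""
    return sum(
        p_a[a] * p_b[b] * payoff[a][b]
        for a in range(len(p_a)) for b in range(len(p_b))
    )

def is_approx_nash(p_a, p_b, payoff, eps_a, eps_b):
    """Check the (eps_a, eps_b)-Nash conditions for a one-stage zero-sum game."""
    current = expected_payoff(p_a, p_b, payoff)
    # Best payoff Alice (the maximizer) can reach by deviating unilaterally.
    best_for_alice = max(
        sum(p_b[b] * payoff[a][b] for b in range(len(p_b)))
        for a in range(len(p_a))
    )
    # Lowest payoff Bob (the minimizer) can reach by deviating unilaterally.
    best_for_bob = min(
        sum(p_a[a] * payoff[a][b] for a in range(len(p_a)))
        for b in range(len(p_b))
    )
    return best_for_alice - current <= eps_a and current - best_for_bob <= eps_b

# Hypothetical example: matching pennies.
MATCHING_PENNIES = [[1, -1], [-1, 1]]
```

For instance, the pure profile in which both players choose their first action gives Alice payoff $1$; she cannot improve, but Bob can lower the payoff by $2$, so the profile is a $(0,2)$-Nash equilibrium and not a $(0,1)$-Nash equilibrium.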

Definition 18.

We say $v$ is an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium payoff if for arbitrary $\delta>0$ there exist a natural number $n_{0}$ and a sequence of strategy profiles $\{(\sigma^{n},\tau^{n})\}_{n\in\mathbbm{N}}$ such that for all $n\geq n_{0}$, $(\sigma^{n},\tau^{n})$ forms an $(\epsilon_{A}+\delta,\epsilon_{B}+\delta)$-Nash equilibrium, and $|\lambda(\sigma^{n},\tau^{n})-v|\leq\delta$.

Definition 19.

An $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium is achievable in long run if for all $\delta>0$, there exists a natural number $n_{0}$ such that for all $n\geq n_{0}$, in the $n$-stage repeated game, there exists an $(\epsilon_{A}+\delta,\epsilon_{B}+\delta)$-Nash equilibrium.

We will now characterize the set of all approximate Nash equilibria of the repeated game with leaked randomness source. To do so, we first need to comment on the long run security level of the players. As stated in Theorem 5, Alice can secure arbitrary payoff $v$ in long run if and only if $v\leq\mathcal{J}_{cav}^{(A)}(H(X|Y))$ , where $\mathcal{J}_{cav}^{(A)}(.)$ is the upper concave envelope of $\mathcal{J}^{(A)}(.)$ defined in Definition 4. Using Theorem 5, we can derive a similar result from Bob’s (the minimizer) point of view. Consider the following definitions:

Definition 20.

Let $v$ be an arbitrary real value:

1.

Bob can secure $v$ in the $n$-stage repeated game if there exists a strategy $\tau^{n}$ for Bob such that for every strategy $\sigma^{n}$ of Alice we have $\lambda(\sigma^{n},\tau^{n})\leq v$.

2.

Bob can secure $v$ in long run if there exists a sequence of strategies $\{\tau^{n}\}_{n\in\mathbbm{N}}$ for Bob such that for all sequences of strategies $\{\sigma^{n}\}_{n\in\mathbbm{N}}$ of Alice we have $\limsup_{n\to\infty}\lambda(\sigma^{n},\tau^{n})\leq v$ .

Definition 21.

In a stage game, the security level of mixed action $p_{B}$ for Bob is denoted by $U^{(B)}(p_{B})$ , and is defined as follows:

[TABLE]

Furthermore, the minimum cost that Bob can secure in a stage game, by playing mixed actions of entropy at most $\mathsf{h}$ , is denoted by $\mathcal{J}^{(B)}(\mathsf{h})$ , and is defined as:

[TABLE]

Next, by replacing the stage payoff $u_{ab}$ with $-u_{ab}$ , and hence, considering Bob as the maximizer, we can deduce the following corollary of Theorem 5.

Corollary 22.

Let $\mathcal{J}^{(B)}_{\text{vex}}(\mathsf{h})$ be the lower convex envelope of $\mathcal{J}^{(B)}(\mathsf{h})$ defined in Definition 21. In the repeated game with leaked randomness source, Bob can secure $v$ in long run if and only if $v\geq\mathcal{J}^{(B)}_{\text{vex}}(H(Y|X))$. Furthermore, in the $n$-stage game, for any $n\in\mathbbm{N}$, Bob can secure $v$ only if $v\geq\mathcal{J}^{(B)}_{\text{vex}}(H(Y|X))$.

Note that the functions $\mathcal{J}^{(A)}(\mathsf{h})$ and $\mathcal{J}^{(B)}(\mathsf{h})$ are respectively increasing and decreasing in $\mathsf{h}$. On the other hand, the minimax theorem (Von Neumann (1928)) implies that $\mathcal{J}^{(A)}(+\infty)=\mathcal{J}^{(B)}(+\infty)$; thus, for arbitrary $\mathsf{h}$ and $\mathsf{h}^{\prime}$, we have $\mathcal{J}^{(B)}(\mathsf{h})\geq\mathcal{J}^{(A)}(\mathsf{h}^{\prime})$. Hence, $\mathcal{J}^{(B)}_{vex}(H(Y|X))\geq\mathcal{J}^{(A)}_{cav}(H(X|Y))$.
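The monotonicity and ordering claims above can be checked numerically on a small example. The sketch below approximates $\mathcal{J}^{(A)}(\mathsf{h})$ and $\mathcal{J}^{(B)}(\mathsf{h})$ by a grid search for a game with two actions per player. It assumes that Bob's security level is the worst-case (largest) expected payoff over Alice's pure actions and that entropy is measured in bits; the grid resolution, function names and payoff table are hypothetical.

```python
import math

def binary_entropy(p):
    """Entropy in bits of the pmf (p, 1 - p)."""
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

def j_alice(h, payoff, steps=1000):
    """Grid approximation of the best security level for Alice with entropy <= h."""
    best = -math.inf
    for i in range(steps + 1):
        p = i / steps
        if binary_entropy(p) <= h + 1e-12:
            level = min(p * payoff[0][b] + (1 - p) * payoff[1][b] for b in range(2))
            best = max(best, level)
    return best

def j_bob(h, payoff, steps=1000):
    """Grid approximation of the best (lowest) security level for Bob with entropy <= h."""
    best = math.inf
    for i in range(steps + 1):
        q = i / steps
        if binary_entropy(q) <= h + 1e-12:
            level = max(q * payoff[a][0] + (1 - q) * payoff[a][1] for a in range(2))
            best = min(best, level)
    return best

# Hypothetical example: matching pennies.
MATCHING_PENNIES = [[1, -1], [-1, 1]]
levels = [0.0, 0.25, 0.5, 0.75, 1.0]
alice_values = [j_alice(h, MATCHING_PENNIES) for h in levels]
bob_values = [j_bob(h, MATCHING_PENNIES) for h in levels]
```

In this example Alice's values increase from $-1$ to $0$, Bob's values decrease from $1$ to $0$, and every value for Bob is at least every value for Alice, as the text asserts.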

In the following theorem we characterize the set of achievable $(\epsilon_{A},\epsilon_{B})$ -Nash equilibrium payoffs in terms of the individually secured payoffs $\mathcal{J}^{(A)}_{cav}(H(X|Y))$ and $\mathcal{J}^{(B)}_{vex}(H(Y|X))$ .

Theorem 23.

In the repeated game with leaked randomness source defined in Section 3.1, $v$ is a $(\epsilon_{A},\epsilon_{B})$ -Nash equilibrium payoff if and only if

[TABLE]

where $\overline{m}$ and $\underline{m}$ are the maximum and minimum entries of the payoff table, respectively ( $\overline{m}=\max_{(a,b)\in\mathcal{A}\times\mathcal{B}}u_{ab}$ , and $\underline{m}=\min_{(a,b)\in\mathcal{A}\times\mathcal{B}}u_{ab}$ ).

The proof of Theorem 23 is provided in Section 4.1.

Remark 24.

If $\epsilon_{A}=\epsilon_{B}=0$ , the set of payoffs satisfying (51) is empty unless $H(X|Y)$ and $H(Y|X)$ are large enough such that

[TABLE]

where $v^{*}=\max_{p_{A}\in\Delta(\mathcal{A})}\min_{p_{B}\in\Delta(\mathcal{B})}\sum_{a\in\mathcal{A},b\in\mathcal{B}}p_{A}(a)p_{B}(b)u_{ab}$. In this case, if (52) holds, the only equilibrium payoff is $v^{*}$, i.e., the max-min value of the stage game. This particular result coincides with the “Folk Theorem” for two-player zero-sum repeated games in which players can freely randomize their actions.

We can refine Theorem 23 to characterize the set of $\epsilon_{A}$ and $\epsilon_{B}$ for which $(\epsilon_{A},\epsilon_{B})$ -Nash equilibrium is achievable in long run. Let $\epsilon_{A}$ and $\epsilon_{B}$ be arbitrary positive numbers. If there exists a $v$ satisfying Equation (51), then, Theorem 23 implies that $(\epsilon_{A},\epsilon_{B})$ -Nash equilibrium is achievable. On the other hand, let $(\sigma^{n},\tau^{n})$ form an $(\epsilon_{A},\epsilon_{B})$ -Nash equilibrium; then, Theorem 23 implies that $v=\lambda(\sigma^{n},\tau^{n})$ satisfies (51). In other words, $(\epsilon_{A},\epsilon_{B})$ -Nash equilibrium is achievable if and only if there exists a real number $v$ satisfying (51). Therefore, by removing $v$ from Equation (51), and rewriting it in terms of $\epsilon_{A}$ and $\epsilon_{B}$ , we conclude the following corollary of Theorem 23.

Corollary 25.

In the repeated game with leaked randomness source defined in Section 3.1, an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium is achievable in the long run if and only if

[TABLE]
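As an illustration of how Theorem 23 and Corollary 25 can be evaluated numerically, the following sketch assumes that condition (51) has the two-sided form $\max\{\underline{m},\mathcal{J}^{(B)}_{vex}(H(Y|X))-\epsilon_{A}\}\leq v\leq\min\{\overline{m},\mathcal{J}^{(A)}_{cav}(H(X|Y))+\epsilon_{B}\}$, which is the form suggested by the necessity argument of Section 4.1. The function and argument names are hypothetical, and the two securable payoffs are passed in as plain numbers rather than computed from the entropies.

```python
def payoff_interval(eps_a, eps_b, j_cav_a, j_vex_b, m_lo, m_hi):
    """Interval of payoffs v allowed by the ASSUMED two-sided form of (51)."""
    # Alice can always reach J_vex^(B) by deviating, so v >= J_vex^(B) - eps_a.
    low = max(m_lo, j_vex_b - eps_a)
    # Bob can always hold Alice to J_cav^(A) by deviating, so v <= J_cav^(A) + eps_b.
    high = min(m_hi, j_cav_a + eps_b)
    return low, high


def is_equilibrium_payoff(v, eps_a, eps_b, j_cav_a, j_vex_b, m_lo, m_hi):
    """Check whether v lies in the interval above."""
    low, high = payoff_interval(eps_a, eps_b, j_cav_a, j_vex_b, m_lo, m_hi)
    return low <= v <= high


def is_achievable(eps_a, eps_b, j_cav_a, j_vex_b, m_lo, m_hi):
    """Eliminate v: some v satisfies the condition iff the interval is nonempty."""
    low, high = payoff_interval(eps_a, eps_b, j_cav_a, j_vex_b, m_lo, m_hi)
    return low <= high


# Hypothetical numbers: payoffs in [-1, 1], J_cav^(A) = -1, J_vex^(B) = 0.
print(is_achievable(0.5, 0.5, -1.0, 0.0, -1.0, 1.0))    # True
print(is_achievable(0.25, 0.25, -1.0, 0.0, -1.0, 1.0))  # False
```

Under this assumed form, and when both securable payoffs lie in $[\underline{m},\overline{m}]$, the interval is nonempty exactly when $\epsilon_{A}+\epsilon_{B}\geq\mathcal{J}^{(B)}_{vex}(H(Y|X))-\mathcal{J}^{(A)}_{cav}(H(X|Y))$.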

4.1 Proof of Theorem 23

We prove that inequality (51) is both necessary and sufficient for $v$ to be an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium payoff.

Inequality (51) is necessary: In the $n$ stage repeated game, let $\sigma^{n}$ and $\tau^{n}$ be arbitrary strategies for Alice and Bob generating an $(\epsilon^{\prime}_{A},\epsilon^{\prime}_{B})$ -Nash equilibrium. According to Corollary 22, given the strategy $\tau^{n}$ for Bob, there exists a strategy $\sigma^{*n}$ for Alice such that $\lambda(\sigma^{*n},\tau^{n})\geq\mathcal{J}^{(B)}_{vex}(H(Y|X))$ . Hence,

[TABLE]

But $(\sigma^{n},\tau^{n})$ is an $(\epsilon^{\prime}_{A},\epsilon^{\prime}_{B})$-Nash equilibrium; thus, we must have

[TABLE]

Similarly, Theorem 5 implies that given the strategy $\sigma^{n}$ for Alice, there exists a strategy $\tau^{*n}$ for Bob such that $\lambda(\sigma^{n},\tau^{*n})\leq\mathcal{J}^{(A)}_{cav}(H(X|Y))$ . Hence,

[TABLE]

But $(\sigma^{n},\tau^{n})$ is an $(\epsilon^{\prime}_{A},\epsilon^{\prime}_{B})$-Nash equilibrium; thus, we must have

[TABLE]

On the other hand, since $\lambda(\sigma^{n},\tau^{n})$ is a convex combination of the entries of the payoff table, we have $\underline{m}\leq\lambda(\sigma^{n},\tau^{n})\leq\overline{m}$ ; this fact along with Equations (54) and (55) implies that

[TABLE]

For $v$ to be achievable, the above relation should be satisfied for $\lambda(\sigma^{n},\tau^{n})$ , $\epsilon^{\prime}_{A}$ and $\epsilon^{\prime}_{B}$ arbitrarily close to $v$ , $\epsilon_{A}$ and $\epsilon_{B}$ , respectively. Thus, Equation (51) must hold.

Inequality (51) is sufficient: Let $v$ , $\epsilon_{A}\geq 0$ and $\epsilon_{B}\geq 0$ be real numbers satisfying (51). Equation (51) implies that $\underline{m}\leq v\leq\overline{m}$ ; hence, $v$ can be expressed as a convex combination of the entries of the payoff table; i.e., there exist action profiles $(a_{1},b_{1}),(a_{2},b_{2}),\dots,(a_{r},b_{r})\in\mathcal{A}\times\mathcal{B}$ , and non-negative numbers $\alpha_{1},\alpha_{2},\dots,\alpha_{r}$ summing to one such that

[TABLE]

Let us approximate each $\alpha_{i}$ by a rational number $k_{i}/K$ such that $\sum_{i=1}^{r}k_{i}=K$ ; for arbitrary $\delta>0$ , we can choose $K$ large enough such that

[TABLE]

where $\hat{v}=\sum_{i=1}^{r}\frac{k_{i}}{K}u_{a_{i}b_{i}}$. We take $K$ so large that not only is inequality (56) satisfied, but there also exist strategies $\sigma^{*K}$ and $\tau^{*K}$ such that, in the $K$-stage repeated game, $\sigma^{*K}$ secures an expected average payoff of $\mathcal{J}^{(A)}_{cav}(H(X|Y))-\delta$ for Alice, and $\tau^{*K}$ secures $\mathcal{J}^{(B)}_{vex}(H(Y|X))+\delta$ for Bob. Such strategies $\sigma^{*K}$ and $\tau^{*K}$ exist because, in the long run, Alice can secure $\mathcal{J}^{(A)}_{cav}(H(X|Y))$ and Bob can secure $\mathcal{J}^{(B)}_{vex}(H(Y|X))$.
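The rational approximation of the weights $\alpha_{i}$ by $k_{i}/K$ can be sketched as follows. The rounding rule used here (largest remainder) is only one convenient choice and is an assumption of this sketch, not something the proof prescribes; the function names are hypothetical. Exact fractions are used so that the constraint $\sum_{i}k_{i}=K$ holds without floating-point error.

```python
from fractions import Fraction


def rational_weights(alphas, K):
    """Round weights alphas (non-negative, summing to 1) to counts k_i with sum k_i = K."""
    scaled = [Fraction(a) * K for a in alphas]
    ks = [int(s) for s in scaled]          # floor, since every entry is non-negative
    leftover = K - sum(ks)                 # stages still to be assigned
    # Give the leftover stages to the entries with the largest fractional parts.
    order = sorted(range(len(ks)), key=lambda i: scaled[i] - ks[i], reverse=True)
    for i in order[:leftover]:
        ks[i] += 1
    return ks


def block_payoff(ks, payoffs):
    """Average payoff v_hat of one block that plays profile i in k_i of its K stages."""
    K = sum(ks)
    return sum(Fraction(k, K) * u for k, u in zip(ks, payoffs))


# Hypothetical target: v = (1/3)(+1) + (2/3)(-1) = -1/3, approximated with K = 10.
alphas = [Fraction(1, 3), Fraction(2, 3)]
payoffs = [1, -1]
ks = rational_weights(alphas, 10)
print(ks)                          # [3, 7]
print(block_payoff(ks, payoffs))   # -2/5
```

Since each $|\alpha_{i}-k_{i}/K|$ is at most $1/K$ under this rounding, the gap $|v-\hat{v}|$ is at most $r\mathsf{M}/K$ with $\mathsf{M}=\max_{(a,b)}|u_{ab}|$, so it can be made smaller than any $\delta>0$ by increasing $K$.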

Now we are ready to construct the desired approximate Nash equilibrium $(\sigma^{n},\tau^{n})$. Let the total number of stages be of the form $n=NK$, and divide the $n$ stages into $N$ blocks of length $K$. The value of $N$ will be set in the sequel. In each block, Alice and Bob cycle through the action profiles $(a_{1},b_{1}),\dots,(a_{r},b_{r})$ such that each action profile $(a_{i},b_{i})$ is played in $k_{i}$ stages. Note that the actions $(a_{1},b_{1}),\dots,(a_{r},b_{r})$ are deterministic; thus, each player can monitor the other player's actions to check whether the other player is still following the rule. If Alice (resp. Bob) deviates from the rule, then in all subsequent blocks Bob (resp. Alice) plays according to the strategy $\tau^{*K}$ (resp. $\sigma^{*K}$) to secure the payoff $\mathcal{J}^{(B)}_{vex}(H(Y|X))+\delta$ (resp. $\mathcal{J}^{(A)}_{cav}(H(X|Y))-\delta$).

When Alice and Bob both play according to respective strategies $\sigma^{n}$ and $\tau^{n}$ , the expected average payoff equals $\hat{v}$ , i.e.,

[TABLE]

Next, we show that the strategy profile $(\sigma^{n},\tau^{n})$ forms the desired approximate Nash equilibrium. Let Alice deviate from strategy $\sigma^{n}$ , and play an arbitrary strategy $\sigma^{\prime n}$ . Furthermore, let the deviation be detected by Bob at block $j\in\{1,2,\dots,N\}$ . The expected average payoff will be $\hat{v}$ in the blocks before the $j$ -th block, and the payoff of the blocks after the $j$ -th block (where Bob plays $\tau^{*K}$ ) will be at most $\mathcal{J}^{(B)}_{vex}(H(Y|X))+\delta$ . In the $j$ -th block, Alice could get at most $\mathsf{M}=\max_{(a,b)\in\mathcal{A}\times\mathcal{B}}|u_{ab}|$ , thus,

[TABLE]

Equation (58) along with Equation (57) implies:

[TABLE]

where (59) follows from Equation (56) and the fact that $\hat{v}\geq-\mathsf{M}$. On the other hand, Equation (51) implies:

[TABLE]

Equations (59) and (60) imply:

[TABLE]

By a similar argument we can also show that for arbitrary strategy $\tau^{\prime n}$ for Bob we have

[TABLE]

Inequalities (61) and (62) imply that the strategy profile $(\sigma^{n},\tau^{n})$ forms an $(\epsilon_{A}+2\mathsf{M}/N+2\delta,\epsilon_{B}+2\mathsf{M}/N+2\delta)$-Nash equilibrium with expected average payoff $\hat{v}$. We can choose $\delta$ small enough and $N$ large enough to make $2\mathsf{M}/N+2\delta$ as small as desired and, by (56), $\hat{v}$ as close to $v$ as desired. Thus, $v$ is an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium payoff.
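The bookkeeping behind the deviation bound can be checked numerically. The sketch below follows the verbal description of the proof: blocks before the detection block $j$ pay $\hat{v}$, block $j$ pays at most $\mathsf{M}$, and later blocks pay at most $\mathcal{J}^{(B)}_{vex}(H(Y|X))+\delta$. The function name and all numerical values are hypothetical, and the lower bound $v\geq\mathcal{J}^{(B)}_{vex}(H(Y|X))-\epsilon_{A}$ is taken as an assumption about the form of (51).

```python
def deviation_payoff_bound(j, n_blocks, v_hat, punish, m_abs):
    """Upper bound on Alice's average payoff if her deviation is detected in block j (1-based)."""
    before = (j - 1) * v_hat            # blocks 1..j-1 follow the agreed cycle
    during = m_abs                      # block j pays at most M
    after = (n_blocks - j) * punish     # blocks j+1..N: Bob plays his securing strategy
    return (before + during + after) / n_blocks


# Hypothetical numbers.
m_abs, j_vex_b = 1.0, 0.0
eps_a, delta, n_blocks = 0.5, 0.01, 100
v = j_vex_b - eps_a            # smallest payoff allowed by the assumed lower bound on v
v_hat = v - 0.005              # rational approximation within delta of v
punish = j_vex_b + delta       # what Bob can hold Alice to after detection

worst_gain = max(
    deviation_payoff_bound(j, n_blocks, v_hat, punish, m_abs) - v_hat
    for j in range(1, n_blocks + 1)
)
slack = eps_a + 2 * m_abs / n_blocks + 2 * delta
print(worst_gain <= slack)     # True
```

In this instance the largest gain occurs when the deviation is detected in the first block, because every punished block still pays more than $\hat{v}$; the gain nevertheless stays below $\epsilon_{A}+2\mathsf{M}/N+2\delta$.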

4.2 Approximate Nash equilibria achieved by autonomous strategies

We call a strategy autonomous if the action at each stage does not depend on the actions of the opponent in the previous stages. Formally, in the $n$-stage repeated game with leaked randomness sources defined in Section 3.1, strategies $\sigma^{n}=(\sigma_{1},\dots,\sigma_{n})$ and $\tau^{n}=(\tau_{1},\dots,\tau_{n})$ are autonomous if for arbitrary $t\in\{1,\dots,n\}$ and arbitrary histories $a^{t-1},\tilde{a}^{t-1}\in\mathcal{A}^{t-1}$, $b^{t-1},\tilde{b}^{t-1}\in\mathcal{B}^{t-1}$, $x^{t}\in\mathcal{X}^{t}$ and $y^{t}\in\mathcal{Y}^{t}$ we have

[TABLE]

Autonomous strategies are sufficient for construction of a Nash equilibrium for two-player zero-sum repeated games, where players can freely randomize their actions. Furthermore, in the repeated game with leaked randomness source, the maximum securable payoff of Alice (the max-min payoff) can be secured by an autonomous strategy (see (Valizadeh and Gohari, 2017, Section 3.3)). Therefore, we are also interested in the set of approximate Nash equilibria achievable by the class of autonomous strategies.

In this section, we characterize the set of approximate Nash equilibria achievable by autonomous strategies in a simplified version of the repeated game with leaked randomness source. In the simplified version, we assume that the randomness sources $X^{n}$ and $Y^{n}$ are independent, i.e., $p_{XY}=p_{X}p_{Y}$ , thus, we call it the repeated game with independent randomness sources. It will turn out that the set of approximate Nash equilibria achievable by autonomous strategies is strictly smaller than the set of all achievable approximate Nash equilibria in Corollary 25. To proceed we need the following definition.

Definition 26.

An $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium is achievable by autonomous strategies if for all $\delta>0$ there exists a natural number $n_{0}$ and a sequence of autonomous strategy profiles $\{(\sigma^{n},\tau^{n})\}_{n\in\mathbbm{N}}$ such that for all $n\geq n_{0}$, $(\sigma^{n},\tau^{n})$ forms an $(\epsilon_{A}+\delta,\epsilon_{B}+\delta)$-Nash equilibrium in the $n$-stage repeated game.

Theorem 27.

In the repeated game with independent randomness sources, $(\epsilon_{A},\epsilon_{B})$ -Nash equilibrium is achievable by autonomous strategies if and only if there exist random variables $A\in\mathcal{A}$ , $B\in\mathcal{B}$ and $Q\in\{0,1,2,3\}$ such that $p_{ABQ}(a,b,q)=p_{Q}(q)p_{A|Q}(a|q)p_{B|Q}(b|q)$ and

[TABLE]

where $g_{A}(A,B|Q)$ and $g_{B}(A,B|Q)$ are defined as follows

[TABLE]

Proof of Theorem 27 is provided in Appendix C.

Example 28.

Consider a repeated game with independent randomness sources $X^{n}$ and $Y^{n}$ such that $H(X)=0$ , and $H(Y)=1$ . The sets of actions of Alice and Bob are $\mathcal{A}=\mathcal{B}=\{0,1\}$ , and the payoff table is as follows:

[TABLE]

Since $H(X)=0$, Alice must play deterministic actions, by which she can secure at most $-1$; hence, $\mathcal{J}^{(A)}_{cav}(H(X|Y))=-1$. On the other hand, Bob has access to one bit of randomness per stage; thus, in each stage, he can play according to the max-min strategy of the one-shot game and secure a payoff of $0$; hence, $\mathcal{J}^{(B)}_{vex}(H(Y|X))=0$. Consequently, Corollary 25 implies that an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium is achievable if and only if $\epsilon_{A}+\epsilon_{B}\geq 1$. Hence, a $(1/2,1/2)$-Nash equilibrium is achievable. It is straightforward to check that $\epsilon_{A}=1/2$ and $\epsilon_{B}=1/2$ do not satisfy the conditions of Theorem 27; therefore, in the repeated game of this example, a $(1/2,1/2)$-Nash equilibrium is not achievable by autonomous strategies.
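The two securable payoffs quoted in this example can be reproduced with a few lines of code. The sketch assumes a matching-pennies payoff table for Alice ($+1$ when the actions match, $-1$ otherwise); this table is an assumption chosen because it is consistent with the values $-1$ and $0$ stated above, and the function names are hypothetical.

```python
def pure_maxmin(u):
    """Best payoff the row player can guarantee with a deterministic action."""
    return max(min(row) for row in u)


def mixed_value_2x2(u):
    """Max-min value of a 2x2 zero-sum game when the row player may randomize."""
    (a, b), (c, d) = u

    def guaranteed(p):
        # Payoff guaranteed when row 0 is played with probability p.
        return min(p * a + (1 - p) * c, p * b + (1 - p) * d)

    # The guaranteed payoff is concave and piecewise linear in p, so its maximum
    # is attained at an endpoint or where the two column payoffs are equal.
    candidates = [0.0, 1.0]
    denom = a - b - c + d
    if denom != 0:
        p_star = (d - c) / denom
        if 0.0 <= p_star <= 1.0:
            candidates.append(p_star)
    return max(guaranteed(p) for p in candidates)


u = [[1, -1], [-1, 1]]            # ASSUMED payoff table (matching pennies)
j_cav_a = pure_maxmin(u)          # Alice has no randomness: H(X) = 0
j_vex_b = mixed_value_2x2(u)      # Bob has one bit per stage and plays (1/2, 1/2)
print(j_cav_a, j_vex_b)           # -1 0.0
print(j_vex_b - j_cav_a)          # 1.0
```

The printed gap of $1$ is the threshold that $\epsilon_{A}+\epsilon_{B}$ must reach in the condition $\epsilon_{A}+\epsilon_{B}\geq 1$ stated in the example.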

Remark 29.

In the repeated game of Example 28, the set of approximate Nash equilibria achievable by autonomous strategies is strictly smaller than the set of approximate Nash equilibria achievable by arbitrary strategies. Therefore, for achieving approximate equilibria of the repeated games with leaked randomness source, autonomous strategies are not sufficient.

Appendix A Proof of Lemma 14

For arbitrary $\alpha\in[1,2]$ we have

[TABLE]

where $\mathbbm{1}(.)$ is the indicator function, and Equation (A.1) follows from

[TABLE]

and reordering the summations. Equation (A.2) follows from $\sqrt[1/\alpha]{\beta^{\alpha}}=\beta$, and $\sum_{x\in\mathcal{X}}p_{X|Y}(x|y)=1$. Inequality (A.3) follows from Jensen's inequality for the concave function $\sqrt[\alpha]{\cdot}$.

Next, we claim that for arbitrary $y\in\mathcal{Y}$ and $a\in\mathcal{A}$ ,

[TABLE]

Therefore, Equations (A.3) and (A.4) imply

[TABLE]

The above inequality completes the proof. Thus, we only need to prove the claim of Equation (A.4). Instead of proving Equation (A.4) directly, we prove Equation (A.5), which is obtained by replacing $p_{X|Y}(x|y)$ with an arbitrary real function $g:\mathcal{X}\to\mathbb{R}$:

[TABLE]

In order to interpret the above inequality, let us define $\sigma$ -finite measure spaces $(\mathcal{X},\Sigma_{\mathcal{X}},\mu_{\mathcal{X}})$ and $(\mathcal{F},\Sigma_{\mathcal{F}},\mu_{\mathcal{F}})$ , where for all $x\in\mathcal{X}$ , $\mu_{\mathcal{X}}(x)=1$ , and for all $f\in\mathcal{F}$ , $\mu_{\mathcal{F}}(f)=p_{F}(f)$ . Furthermore, let $T:\mathcal{G}_{\mathcal{X}}\to\mathcal{G}_{\mathcal{F}}$ be a linear operator that maps $\mathcal{G}_{\mathcal{X}}$ (the set of real valued functions on $\mathcal{X}$ ) to $\mathcal{G}_{\mathcal{F}}$ (the set of real valued functions on $\mathcal{F}$ ) and is defined as below

[TABLE]

Moreover, consider the following definition:

Definition 30.

Let $(\mathcal{Y},\Sigma_{\mathcal{Y}},\mu_{\mathcal{Y}})$ and $(\mathcal{Z},\Sigma_{\mathcal{Z}},\mu_{\mathcal{Z}})$ be $\sigma$ -finite measure spaces, and $h:\mathcal{Y}\to\mathbbm{R}$ be a real function on the measure space $(\mathcal{Y},\Sigma_{\mathcal{Y}},\mu_{\mathcal{Y}})$ . For arbitrary $\beta_{1}>0$ , the $\beta_{1}$ -norm of $h$ is denoted by $\|h\|_{L^{\beta_{1}}(\mu_{\mathcal{Y}})}$ and is defined as below:

[TABLE]

$L^{\beta_{1}}(\mu_{\mathcal{Y}})$ denotes the set of real functions $h:\mathcal{Y}\to\mathbbm{R}$ with bounded $\beta_{1}$-norm, i.e., $\|h\|_{L^{\beta_{1}}(\mu_{\mathcal{Y}})}<\infty$. For arbitrary $\beta_{2}>0$, let $M:L^{\beta_{1}}(\mu_{\mathcal{Y}})\to L^{\beta_{2}}(\mu_{\mathcal{Z}})$ be an operator that maps the real functions on the measure space $(\mathcal{Y},\Sigma_{\mathcal{Y}},\mu_{\mathcal{Y}})$ to the real functions on the measure space $(\mathcal{Z},\Sigma_{\mathcal{Z}},\mu_{\mathcal{Z}})$. $\|M\|_{L^{\beta_{1}}(\mu_{\mathcal{Y}})\to L^{\beta_{2}}(\mu_{\mathcal{Z}})}$ denotes the operator norm of $M$ defined as follows:

[TABLE]

Using the above definition and Equation (A.6), we can rewrite Equation (A.5) as follows

[TABLE]

In order to prove Equation (A.7), it suffices to prove it for the special cases $\alpha=1$ and $\alpha=2$ , then, the general form with arbitrary $\alpha\in[1,2]$ will be concluded from the well-known Riesz-Thorin interpolation theorem.

Theorem 31 (Riesz-Thorin Interpolation Theorem).

Let $(\Omega_{1},\Sigma_{1},\mu_{1})$ and $(\Omega_{2},\Sigma_{2},\mu_{2})$ be arbitrary $\sigma$ -finite measure spaces. Suppose $0\leq r_{0}\leq r_{1}\leq\infty$ , $0\leq s_{0}\leq s_{1}\leq\infty$ , and let $T$ be an arbitrary linear operator that maps $L^{r_{0}}(\mu_{1})$ and $L^{r_{1}}(\mu_{1})$ boundedly into $L^{s_{0}}(\mu_{2})$ and $L^{s_{1}}(\mu_{2})$ , respectively. For arbitrary $0\leq\theta\leq 1$ , let $1/r_{\theta}=(1-\theta)/r_{0}+\theta/r_{1}$ and $1/s_{\theta}=(1-\theta)/s_{0}+\theta/s_{1}$ , then, $T$ maps $L^{r_{\theta}}(\mu_{1})$ boundedly into $L^{s_{\theta}}(\mu_{2})$ and satisfies the operator norm estimate

[TABLE]

We complete the proof by proving Equation (A.7), or equivalently Equation (A.5), for $\alpha=1$ and $\alpha=2$ . For $\alpha=1$ , we have

[TABLE]

where (A.8) follows from the property of the random mapping $F$ that $\text{\rm{Pr}}[F(x)=a]=p_{A}(a)$ . Therefore, Equation (A.5) holds for $\alpha=1$ .

For $\alpha=2$ we have

[TABLE]

where (A.10) follows from the fact that for distinct $x$ and $x^{\prime}$, $F(x)$ is independent of $F(x^{\prime})$, and $\sum_{f\in\mathcal{F}}p_{F}(f)\mathbbm{1}(f(x)=a)=\text{\rm{Pr}}[F(x)=a]=p_{A}(a)$. Equation (A.11) is implied by $\text{\rm{Pr}}[F(x)=a]=p_{A}(a)$.
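The random mapping $F$ used throughout this appendix assigns to each $x$ an independent draw $F(x)$ with $\text{\rm{Pr}}[F(x)=a]=p_{A}(a)$. The following simulation is a minimal sketch of that construction for a source without side information: it draws many mappings, computes the induced distribution of $F(X)$, and averages. The function names, the toy distributions and the $\tfrac{1}{2}$ normalization of the total variation distance are assumptions of this sketch.

```python
import random


def draw_mapping(xs, p_a, rng):
    """Draw F with independent entries F(x) distributed according to p_a."""
    actions = list(p_a)
    weights = [p_a[a] for a in actions]
    return {x: rng.choices(actions, weights=weights)[0] for x in xs}


def induced_pmf(p_x, mapping, actions):
    """Distribution of f(X) for a fixed mapping f."""
    out = {a: 0.0 for a in actions}
    for x, px in p_x.items():
        out[mapping[x]] += px
    return out


def total_variation(p, q):
    """Total variation distance, with the 1/2 normalization assumed here."""
    return 0.5 * sum(abs(p[a] - q[a]) for a in p)


def average_over_mappings(p_x, p_a, n_mappings, seed=0):
    """Average the induced pmf and the TV distance over independent random mappings."""
    rng = random.Random(seed)
    actions = list(p_a)
    mean_pmf = {a: 0.0 for a in actions}
    mean_tv = 0.0
    for _ in range(n_mappings):
        q = induced_pmf(p_x, draw_mapping(list(p_x), p_a, rng), actions)
        for a in actions:
            mean_pmf[a] += q[a] / n_mappings
        mean_tv += total_variation(q, p_a) / n_mappings
    return mean_pmf, mean_tv


p_x = {x: 1.0 / 8 for x in range(8)}      # toy coin source
p_a = {"a0": 0.3, "a1": 0.7}              # toy target source
mean_pmf, mean_tv = average_over_mappings(p_x, p_a, n_mappings=2000)
print(mean_pmf, mean_tv)
```

Averaged over the mappings, the induced distribution should be close to $p_{A}$, which is the content of the identity $\text{\rm{Pr}}[F(x)=a]=p_{A}(a)$ used for the case $\alpha=1$; the average total variation distance of an individual mapping is what the lemma controls.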

Appendix B Proof of Proposition 15

$F$ is fully described by its elements $\{F(x)\}_{x\in\mathcal{X}}$ , and hence, $D_{TV}$ is a deterministic function of $\{F(x)\}_{x\in\mathcal{X}}$ , i.e.,

[TABLE]

where $g:\mathcal{A}^{|\mathcal{X}|}\to\mathbbm{R}$ is a deterministic function defined as follows:

[TABLE]

Since $D_{TV}$ is a function of independent random variables, we use McDiarmid's inequality (McDiarmid (1989)).

Let $f,\tilde{f}\in\mathcal{F}$ be two arbitrary mappings with equal assignments for all elements of $\mathcal{X}$ except for some element $x_{0}$ , i.e.,

[TABLE]

Then, we have:

[TABLE]

Therefore, we have

[TABLE]

Furthermore, recall that $\{F(x)\}_{x\in\mathcal{X}}$ are independent random variables. Hence, McDiarmid’s inequality (McDiarmid (1989)) implies:

[TABLE]
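For reference, the general one-sided form of McDiarmid's inequality states that if changing the $i$-th argument of $g$ changes its value by at most $c_{i}$, then $\text{\rm{Pr}}[g-\mathbb{E}[g]\geq t]\leq\exp(-2t^{2}/\sum_{i}c_{i}^{2})$. The sketch below evaluates this bound. The choice $c_{x}=p_{X}(x)$ is an assumption: it is what one obtains for a source without side information when the total variation distance carries the $\tfrac{1}{2}$ normalization, and the constants in Proposition 15 may differ.

```python
import math


def mcdiarmid_tail(t, c):
    """McDiarmid bound exp(-2 t^2 / sum_i c_i^2) on Pr[g - E[g] >= t]."""
    if t <= 0:
        raise ValueError("t must be positive")
    return math.exp(-2.0 * t * t / sum(ci * ci for ci in c))


def tv_bounded_differences(p_x):
    """ASSUMED bounded differences: re-assigning F(x0) moves D_TV by at most p_X(x0)."""
    return list(p_x)


p_x = [0.25, 0.25, 0.25, 0.25]                    # toy source, uniform on four symbols
bound = mcdiarmid_tail(0.5, tv_bounded_differences(p_x))
collision = sum(p * p for p in p_x)               # equals 2 ** (-H_2(X))
print(bound)                                      # exp(-2), about 0.135
print(-math.log2(collision))                      # 2.0
```

With this choice of $c_{x}$, the exponent depends on the source only through $\sum_{x}p_{X}(x)^{2}=2^{-H_{2}(X)}$, so a larger Rényi entropy of order two gives sharper concentration.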

Appendix C Proof of Theorem 27

In Appendix C.1 we show that, under the conditions of Theorem 27, an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium is achievable by autonomous strategies (achievability proof). In Appendix C.2 we show that if the autonomous strategy profile $(\sigma^{n},\tau^{n})$ forms an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium, then $\epsilon_{A}$ and $\epsilon_{B}$ satisfy the conditions of Theorem 27 (converse proof).

Appendix C.1 Achievability proof

Let $A$, $B$ and $Q$ be the random variables in the statement of Theorem 27 for which the entropy constraints hold strictly, i.e., $H(A|Q)<H(X)$ and $H(B|Q)<H(Y)$. Take $n$ of the form $n=NL$, and divide the total $n$ stages into $L$ blocks of $N$ stages. We generate the action sequence of each block (except for the first block) as a function of the random source observed during the previous block. In all stages of the first block, fixed actions $a\in\mathcal{A}$ and $b\in\mathcal{B}$ are played by Alice and Bob, respectively. Furthermore, excluding the first block, each block is further divided into four subblocks, consisting of the following sets of stages:

[TABLE]
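One way to realize such a division is sketched below: the $N$ stages of a block are cut into four consecutive index sets whose sizes are proportional to $p_{Q}(q)$ up to rounding. The rounding rule and the function name are assumptions of this sketch, and the index sets in the definition above may be arranged differently. Exact fractions are used so that the size bound $|\mathcal{I}_{q}|\leq p_{Q}(q)N+1$, which the proof relies on later, can be checked without rounding error.

```python
from fractions import Fraction
from math import floor


def subblock_indices(p_q, n_stages):
    """Split stages 1..N into consecutive sets I_q with |I_q| < p_Q(q) * N + 1."""
    bounds = [0]
    cumulative = Fraction(0)
    for pq in p_q:
        cumulative += Fraction(pq)
        bounds.append(floor(cumulative * n_stages))   # floor of the cumulative share
    bounds[-1] = n_stages                             # the last subblock ends at stage N
    return [list(range(lo + 1, hi + 1)) for lo, hi in zip(bounds, bounds[1:])]


p_q = [Fraction(1, 4)] * 4            # hypothetical time-sharing distribution
blocks = subblock_indices(p_q, 10)
print([len(b) for b in blocks])       # [2, 3, 2, 3]
print(blocks[1])                      # [3, 4, 5]
```

Because each boundary is the floor of a cumulative share of $N$, every subblock has fewer than $p_{Q}(q)N+1$ stages and the subblocks together cover all $N$ stages.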

We generate the strategies of Alice and Bob in such a way that they use the randomness source observed in the previous block to generate the actions of the current block. We would like the generated actions of Alice and Bob to be almost i.i.d. according to the respective distributions $p_{A|Q=q}$ and $p_{B|Q=q}$ during each subblock $\mathcal{I}_{q}$ for all $q\in\{0,1,2,3\}$. Moreover, the action played in each stage should also be independent of the other player's observations up to that stage. As $H(A|Q)<H(X)$ and $H(B|Q)<H(Y)$, intuitively, each player can generate his/her actions with the intended pmf as a function of his/her own randomness source observed during the previous block. Then, $g_{A}(A,B|Q)\leq\epsilon_{A}$ and $g_{B}(A,B|Q)\leq\epsilon_{B}$ would imply that, in the limit, the constructed strategies form an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium. We now make this proof sketch precise.

For arbitrary $i\in\{1,2,\dots,L\}$, let $A_{i}^{N}=(A_{i,1},\dots,A_{i,N})$ and $B_{i}^{N}=(B_{i,1},\dots,B_{i,N})$ denote the actions of Alice and Bob, respectively, and let $X_{i}^{N}=(X_{i,1},\dots,X_{i,N})$ and $Y_{i}^{N}=(Y_{i,1},\dots,Y_{i,N})$ be the randomness sources observed by Alice and Bob, respectively, during block number $i$. Let us construct the strategies $\sigma^{n}$ for Alice and $\tau^{n}$ for Bob as follows: Alice and Bob choose their actions in each block $i\geq 2$ as a deterministic function of the corresponding sequence of random sources observed during the previous block, i.e., block number $i-1$. In particular, there exist sequences of deterministic mappings $\{\varphi_{i}\}_{i=2}^{L}$ and $\{\psi_{i}\}_{i=2}^{L}$ such that for all $i\geq 2$ we have $A_{i}^{N}=\varphi_{i}(X_{i-1}^{N})$ and $B_{i}^{N}=\psi_{i}(Y_{i-1}^{N})$. In all stages of the first block, both Alice and Bob choose an arbitrary fixed action. Next, we complete the specification of the strategies $\sigma^{n}$ and $\tau^{n}$ by specifying the functions $\{\varphi_{i}\}_{i=2}^{L}$ and $\{\psi_{i}\}_{i=2}^{L}$.

Let $q_{A^{N}}$ and $q_{B^{N}}$ be ideal distributions defined as follows:

[TABLE]

For arbitrary $i\geq 2$ , let $\varphi_{i}$ be the mapping of Lemma 14 that simulates $q_{A^{N}}$ from $X_{i-1}^{N}$ , and let $\psi_{i}$ be the mapping of Lemma 14 that simulates $q_{B^{N}}$ from $Y_{i-1}^{N}$ . Thus, for all $i\geq 2$ and $1\leq\alpha\leq 2$ , we have

[TABLE]

Note that $H_{\alpha}(X_{i-1}^{N})=NH_{\alpha}(X)$ , and

[TABLE]

where the first inequality follows from $|\mathcal{I}_{q}|\leq p_{Q}(q)N+1$ and $H_{\frac{1}{\alpha}}(p_{A|Q=q})\leq\log|\mathcal{A}|$ ; the second inequality holds for sufficiently large $N$ . Therefore, Equation (C.3) implies

[TABLE]

Note that $\lim_{\alpha\to 1}\left\{H_{\alpha}(X)-\sum_{q=0}^{3}p_{Q}(q)H_{\frac{1}{\alpha}}(p_{A|Q=q})\right\}=H(X)-H(A|Q)>0$; thus, there exists an $\alpha>1$ such that $H_{\alpha}(X)-\sum_{q=0}^{3}p_{Q}(q)H_{\frac{1}{\alpha}}(p_{A|Q=q})>0$. Therefore, Equation (C.5) implies that for arbitrary $\delta>0$, one can choose $N$ large enough so that

[TABLE]
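The continuity argument used to pick $\alpha>1$ can be illustrated numerically. The sketch below computes Rényi entropies in bits and evaluates the gap $H_{\alpha}(X)-\sum_{q}p_{Q}(q)H_{\frac{1}{\alpha}}(p_{A|Q=q})$ at $\alpha=1$ and at a nearby $\alpha>1$. The distributions are hypothetical toy values and the function names are not from the paper.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy of order alpha in bits; alpha = 1 gives the Shannon entropy."""
    p = [x for x in p if x > 0]
    if math.isclose(alpha, 1.0):
        return -sum(x * math.log2(x) for x in p)
    return math.log2(sum(x ** alpha for x in p)) / (1.0 - alpha)


def entropy_gap(p_x, p_q, p_a_given_q, alpha):
    """H_alpha(X) - sum_q p_Q(q) * H_{1/alpha}(p_{A|Q=q})."""
    target = sum(pq * renyi_entropy(pa, 1.0 / alpha)
                 for pq, pa in zip(p_q, p_a_given_q))
    return renyi_entropy(p_x, alpha) - target


p_x = [0.7, 0.3]                       # toy coin source with H(X) of about 0.881 bits
p_q = [1.0]                            # toy time-sharing variable with a single value
p_a_given_q = [[0.9, 0.1]]             # toy target with H(A|Q) of about 0.469 bits

gap_shannon = entropy_gap(p_x, p_q, p_a_given_q, 1.0)
gap_near = entropy_gap(p_x, p_q, p_a_given_q, 1.1)
print(gap_shannon > 0, gap_near > 0)   # True True
print(gap_near < gap_shannon)          # True
```

Moving $\alpha$ above $1$ lowers $H_{\alpha}(X)$ and raises $H_{\frac{1}{\alpha}}$ of the target, so the gap shrinks, but it stays positive for $\alpha$ close enough to $1$ whenever the Shannon gap $H(X)-H(A|Q)$ is strictly positive.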

A similar argument concludes that for sufficiently large $N$ we have

[TABLE]

Note that $X_{i-1}^{N}$ is independent of $Y_{i-1}^{N}$ , thus, $A_{i}^{N}$ is independent of $B_{i}^{N}$ , i.e., $p_{A_{i}^{N}B_{i}^{N}}=p_{A_{i}^{N}}p_{B_{i}^{N}}$ . This fact along with Equations (C.6), (C.7) and (C.2) implies

[TABLE]

where we utilized the third property of total variation in Lemma 1. Note that $p_{A_{i}^{N}B_{i}^{N}}(a^{N},b^{N})$ is the actual distribution of actions in block number $i$ , while $\prod_{q=0}^{3}\prod_{t\in\mathcal{I}_{q}}p_{A|Q}(a_{t}|q)p_{B|Q}(b_{t}|q)$ is the ideal one. Next, for arbitrary $\delta^{\prime}>0$ , we have

[TABLE]

where the first inequality follows from the following two facts:

1. The expected payoff of the first block is bounded below by $-\mathsf{M}N$, where $\mathsf{M}=\max_{(a,b)\in\mathcal{A}\times\mathcal{B}}|u_{ab}|$.

2. Inequality (C.8) implies that, excluding the first block, the expected payoff of each block is within $4\mathsf{M}N\delta$ of the expected payoff induced by the ideal distribution of actions.

The second inequality holds for sufficiently large $L$ and $N$, because $(L-1)|\mathcal{I}_{q}|/(NL)$ tends to $p_{Q}(q)$ and $\mathsf{M}/L$ tends to zero as $N$ and $L$ tend to infinity.

Next, we prove that the strategies $\sigma^{n}$ and $\tau^{n}$ constructed above form the desired approximate Nash equilibrium. Consider an arbitrary strategy $\hat{\sigma}^{n}$ (not necessarily an autonomous strategy) for Alice, and let Alice and Bob play according to the strategy profile $(\hat{\sigma}^{n},\tau^{n})$. In this case, let $\hat{A}^{N}_{i}$ and $\hat{B}^{N}_{i}$ denote the sequences of actions of Alice and Bob in block number $i\geq 1$; moreover, let $\hat{X}_{i}^{N}$ and $\hat{Y}_{i}^{N}$ denote the sequences of randomness sources observed during block number $i\geq 1$. Observe that $\tau^{n}$ is an autonomous strategy; thus, changing the strategy of Alice from $\sigma^{n}$ to $\hat{\sigma}^{n}$ has no impact on the actions of Bob. Hence, the sequence of actions of Bob $(\hat{B}^{N}_{i})$ still satisfies the property of Equation (C.7), i.e.,

[TABLE]

Note that at the $t$-th stage of block number $i$, Alice learns about $\hat{Y}_{i-1}^{N}$ only through $\hat{B}_{i}^{t-1}$; thus, $\hat{A}_{i,t}$ is independent of $\hat{Y}_{i-1}^{N}$ given $\hat{B}_{i}^{t-1}$. On the other hand, $(\hat{B}_{i,t},\hat{B}_{i,t+1},\dots,\hat{B}_{i,N})$ is a deterministic function of $\hat{Y}_{i-1}^{N}$. Therefore, $\hat{A}_{i,t}$ is also independent of $(\hat{B}_{i,t},\hat{B}_{i,t+1},\dots,\hat{B}_{i,N})$ given $\hat{B}_{i}^{t-1}$. Hence,

[TABLE]

Equations (C.10) and (C.11) along with the first property of total variation in Lemma 1 imply that for all $i\geq 2$ , we have

[TABLE]

Note that the ideal distribution $\prod_{q=0}^{3}\prod_{t\in\mathcal{I}_{q}}p_{B|Q}(b_{t}|q)p_{\hat{A}_{i,t}|\hat{A}^{t-1}_{i}\hat{B}^{t-1}_{i}}(a_{t}|a^{t-1},b^{t-1})$ guarantees that the expected payoff of block number $i\geq 2$ is no more than

[TABLE]

On the other hand, Equation (C.12) implies that the actual expected payoff of an arbitrary block $i\geq 2$ is within $2\mathsf{M}N\delta$ of the ideal one. Thus, the actual expected payoff of block number $i\geq 2$ is no more than

[TABLE]

Furthermore, the expected payoff of the first block is bounded above by $N\mathsf{M}$ ; thus,

[TABLE]

where the second inequality holds for sufficiently large $L$ and $N$ .

Equations (C.9) and (C.13) imply

[TABLE]

where the second inequality follows from the assumption in the statement of the theorem that $g_{A}(A,B|Q)\leq\epsilon_{A}$ . By a similar argument as above, we can show that for arbitrary strategy $\hat{\tau}^{n}$ for Bob we have

[TABLE]

Inequalities (C.14) and (C.15) imply that the strategy profile $(\sigma^{n},\tau^{n})$ forms an $(\epsilon_{A}+2\delta^{\prime}+6\mathsf{M}\delta,\epsilon_{B}+2\delta^{\prime}+6\mathsf{M}\delta)$-Nash equilibrium. Since one can choose $\delta$ and $\delta^{\prime}$ small enough to make $2\delta^{\prime}+6\mathsf{M}\delta$ as small as desired, an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium is achievable.

Appendix C.2 Converse proof

Let $(\sigma^{n},\tau^{n})$ be an autonomous strategy profile generating an $(\epsilon_{A},\epsilon_{B})$ -Nash equilibrium in the $n$ stage repeated game, and let $A^{n}$ and $B^{n}$ be the sequence of actions of Alice and Bob. Let $T$ be a random variable chosen from $\{1,2,\dots,n\}$ uniformly and independent of $(X^{n},Y^{n},A^{n},B^{n})$ . Let us define

[TABLE]

We show that the random variables $Q,\tilde{A},\tilde{B}$ along with $\epsilon_{A}$ and $\epsilon_{B}$ satisfy the conditions of Theorem 27.

The Markov conditions: In this part, our goal is to show that $\tilde{A}$ is independent of $\tilde{B}$ given $Q$, i.e., $p_{\tilde{A}\tilde{B}Q}=p_{Q}p_{\tilde{A}|Q}p_{\tilde{B}|Q}$. To do this, it suffices to prove that for all $t\geq 1$, $A_{t}$ is independent of $B_{t}$ given $(A^{t-1},B^{t-1})$. Note that the strategies $\sigma^{n}$ and $\tau^{n}$ are autonomous; hence, $A^{n}$ is a deterministic function of $X^{n}$, and $B^{n}$ is a deterministic function of $Y^{n}$. Thus, the fact that $X^{n}$ is independent of $Y^{n}$ implies that $A^{n}$ is independent of $B^{n}$. Therefore, $A_{t}$ is independent of $B_{t}$ given $(A^{t-1},B^{t-1})$.

Entropy conditions: We show that $H(\tilde{A}|Q)\leq H(X)$ :

[TABLE]

where (C.16) follows from the independence of $T$ from $(A^{n},B^{n})$. Recall that $X^{n}$ is independent of $Y^{n}$; thus, $A^{n}$ (a deterministic function of $X^{n}$) is independent of $B^{n}$ (a deterministic function of $Y^{n}$). Hence, Equation (C.17) holds. Equation (C.18) is implied by the fact that $A^{n}$ is a deterministic function of $X^{n}$. A similar argument justifies $H(\tilde{B}|Q)\leq H(Y)$.

Equilibrium conditions: In this part we show that $g_{A}(\tilde{A},\tilde{B}|Q)\leq\epsilon_{A}$ . To do this, we consider a new game in which Alice and Bob play according to strategy profile $(\hat{\sigma}^{n},\tau^{n})$ , where $\hat{\sigma}^{n}$ will now be constructed. In the new game, let $\hat{A}^{n}$ and $\hat{B}^{n}$ denote the respective actions of Alice and Bob; furthermore, let $\hat{X}^{n}$ and $\hat{Y}^{n}$ denote the respective sources of randomness of Alice and Bob, and $\hat{\mathsf{H}}^{t}_{1}$ denote the history of observations of Alice until stage $t$ . Note that $\tau^{n}$ is the same strategy as considered in the beginning of the converse proof, whereas $\hat{\sigma}^{n}=(\hat{\sigma}_{1},\dots,\hat{\sigma}_{n})$ is generated as follows: given $\hat{h}_{1}^{t}$ , an arbitrary history of observations of Alice until stage $t$ , $\hat{\sigma}_{t}(\hat{h}_{1}^{t})$ is the best choice of Alice that maximizes the expected payoff at stage $t$ , i.e.,

[TABLE]

where $\mathbb{E}_{\tau^{n}}$ denotes the expectation with respect to the probability distribution induced by $\tau^{n}$ and $p_{Y}$ . We have

[TABLE]

where Equation (C.19) results from the following two facts:

1. $\hat{A}^{t-1}$ is a deterministic function of $(\hat{X}^{t},\hat{B}^{t-1})$; thus, $\hat{B}_{t}$ is independent of $\hat{A}^{t-1}$ given $(\hat{X}^{t},\hat{B}^{t-1})$.

2. $\tau^{n}$ is an autonomous strategy; thus, $\hat{B}^{t}$ is a deterministic function of $\hat{Y}^{t}$. On the other hand, $\hat{Y}^{t}$ is independent of $\hat{X}^{t}$. Therefore, $\hat{B}^{t}$ is independent of $\hat{X}^{t}$.

Note again that $\tau^{n}$ is an autonomous strategy, thus, the probability distribution of the actions of Bob is not related to the strategy of Alice; hence, $p_{\hat{B}^{n}}=p_{B^{n}}$ (recall that $B^{n}$ is the sequence of actions of Bob in the original game in which strategy profile $(\sigma^{n},\tau^{n})$ is played); as a result, Equation (C.20) holds. Since $\sigma^{n}$ and $\tau^{n}$ are autonomous strategies, $A^{n}$ is a deterministic function of $X^{n}$ , and $B^{n}$ is a deterministic function of $Y^{n}$ . This fact along with the independence of $X^{n}$ from $Y^{n}$ implies that $A^{n}$ is independent of $B^{n}$ , thus Equation (C.21) follows.

Furthermore, the expected average payoff induced by $(\sigma^{n},\tau^{n})$ equals

[TABLE]

Equations (C.22) and (C.23) imply

[TABLE]

The above equation along with the fact that $(\sigma^{n},\tau^{n})$ is an $(\epsilon_{A},\epsilon_{B})$-Nash equilibrium implies that $g_{A}(\tilde{A},\tilde{B}|Q)\leq\epsilon_{A}$. Using a similar argument, one can show that $g_{B}(\tilde{A},\tilde{B}|Q)\leq\epsilon_{B}$.

Cardinality bound: The identified random variables $(Q,\tilde{A},\tilde{B})$ satisfy the constraints of the theorem, except for the cardinality bound on $Q$. The cardinality of $Q$ can be reduced using standard arguments such as the support lemma of (El Gamal and Kim, 2011, Appendix C), or the Fenchel-Bunt extension of Carathéodory's theorem. We modify $p_{Q\tilde{A}\tilde{B}}$ and generate a new distribution $p^{\prime}_{Q\tilde{A}\tilde{B}}$ so that it also satisfies the cardinality bound. Let $p^{\prime}_{\tilde{A}\tilde{B}|Q}=p_{\tilde{A}\tilde{B}|Q}$; this guarantees that under $p^{\prime}$, given $Q$, $\tilde{A}$ is independent of $\tilde{B}$. Next, we complete the definition of $p^{\prime}_{Q\tilde{A}\tilde{B}}$ by specifying the marginal distribution $p^{\prime}_{Q}$. We can view $p^{\prime}_{Q}$ as a real vector $[p^{\prime}_{Q}(q),q\in\mathcal{Q}]$ satisfying the following linear constraints:

[TABLE]

where $\mathcal{Q}$ is the sample space of the random variable $Q$ .

Let $H^{\prime}(\tilde{A}|Q)$ denote the entropy of $\tilde{A}$ given $Q$ , under the distribution $p^{\prime}_{\tilde{A}\tilde{B}Q}$ . Similarly, the prime superscript in $H^{\prime}(\tilde{B}|Q)$ , $g^{\prime}_{A}(\tilde{A},\tilde{B}|Q)$ and $g^{\prime}_{B}(\tilde{A},\tilde{B}|Q)$ indicates that they are computed according to probability distribution $p^{\prime}_{\tilde{A}\tilde{B}Q}$ . We drop the superscript when the evaluation is done under the original probability distribution $p_{\tilde{A}\tilde{B}Q}$ . Assume that $p^{\prime}_{Q}$ also satisfies the following linear constraints:

[TABLE]

Let $\mathsf{P}$ denote the polytope of marginal distributions $p^{\prime}_{Q}$ satisfying (C.24)-(C.28). Note that $\mathsf{P}$ is not empty since it contains $p_{Q}$ . We choose $p^{\prime}_{Q}$ to be an element of $\mathsf{P}$ that minimizes $g^{\prime}_{B}(\tilde{A},\tilde{B}|Q)$ . This guarantees that $p^{\prime}_{\tilde{A}\tilde{B}Q}$ inherits the following properties from $p_{\tilde{A}\tilde{B}Q}$ :

[TABLE]

We will now show that $p^{\prime}_{\tilde{A}\tilde{B}Q}$ also satisfies the cardinality bound on the support of $Q$. Note that $g^{\prime}_{B}(\tilde{A},\tilde{B}|Q)$ is linear in $p^{\prime}_{Q}$; hence, its minimum occurs at a vertex of the polytope $\mathsf{P}$. The polytope $\mathsf{P}$ lies in a $|\mathcal{Q}|$-dimensional space, and hence, each of its vertices lies in at least $|\mathcal{Q}|$ out of the $|\mathcal{Q}|+4$ hyperplanes defining $\mathsf{P}$ (Equations (C.24)-(C.28)). Therefore, $p^{\prime}_{Q}$, which is a vertex of $\mathsf{P}$, lies in at least $|\mathcal{Q}|-4$ out of the $|\mathcal{Q}|$ hyperplanes of the form (C.24). Hence, $p^{\prime}_{Q}$ has at most 4 non-zero elements.

Bibliography

Beck, C., Schlögl, F., 1995. Thermodynamics of chaotic systems: an introduction. No. 4. Cambridge University Press.
Bernardini, R., Rinaldo, R., 2018. Generalized Elias schemes for efficient harvesting of truly random bits. International Journal of Information Security 17 (1), 67–81.
Budinich, M., Fortnow, L., 2011. Repeated matching pennies with limited randomness. In: Proceedings of the 12th ACM Conference on Electronic Commerce. ACM, pp. 111–118.
El Gamal, A., Kim, Y.-H., 2011. Network information theory. Cambridge University Press.
Elias, P., 1972. The efficient construction of an unbiased random sequence. The Annals of Mathematical Statistics, 865–870.
Gossner, O., Vieille, N., 2002. How to play with a biased coin? Games and Economic Behavior 41 (2), 206–226.
Han, T. S., 2003. Information-spectrum methods in information theory. Vol. 50. Springer-Verlag Berlin Heidelberg.
Hayashi, M., 2011. Exponential decreasing rate of leaked information in universal random privacy amplification. IEEE Transactions on Information Theory 57 (6), 3989–4001.
