Playing Games with Bounded Entropy: Convergence Rate and Approximate Equilibria
Mehrdad Valizadeh, Amin Gohari

TL;DR
This paper analyzes the convergence rate of the long-run value in zero-sum repeated games with limited randomness strategies and characterizes the set of approximate Nash equilibria related to this value.
Contribution
It introduces a new simulation tool for sources based on Rényi entropies and characterizes approximate equilibria in games with bounded entropy strategies.
Findings
Convergence rate of $v_n$ to its limit is exponentially bounded.
Simulation precision depends on Rényi entropy difference.
Set of approximate equilibria closely relates to the long-run max-min value.
Playing Games with Bounded Entropy: Convergence Rate and Approximate Equilibria
Mehrdad Valizadeh
Amin Gohari
Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran
Abstract
We consider zero-sum repeated games in which the players are restricted to strategies that require only a limited amount of randomness. Let $v_n$ be the max-min value of the $n$-stage game; previous works have characterized $\lim_{n\to\infty} v_n$, i.e., the long-run max-min value. Our first contribution is to study the convergence rate of $v_n$ to its limit. To this end, we provide a new tool for simulation of a source (target source) from another source (coin source). Considering the total variation distance as the measure of precision, this tool offers an upper bound for the precision of simulation, which vanishes exponentially in the difference of the Rényi entropies of the coin and target sources. In the second part of the paper, we characterize the set of all approximate Nash equilibria achieved in the long run. It turns out that this set is closely related to the long-run max-min value.
keywords:
Repeated Games, Bounded Entropy, Randomness Extraction, Source Simulation, Information Theory
journal: Games and Economic Behavior
Contents
3 Repeated games with leaked randomness source: convergence rate
3.2 A technical tool: simulation of a source from another source
4 Approximate Nash equilibria of the repeated game with leaked randomness source
4.2 Approximate Nash equilibria achieved by autonomous strategies
1 Introduction
Nash (1950) showed that every finite one-shot game has at least one equilibrium in mixed strategies. Private randomness is required to implement mixed strategies, and consequently a Nash equilibrium may not exist if insufficient random bits are available to the players (see Hubáček et al. (2016) and Budinich and Fortnow (2011)).
Limited randomness in repeated zero-sum games was originally studied by Neyman and Okada (2000) and Gossner and Vieille (2002). Gossner and Vieille (2002) studied a repeated zero-sum game between Alice (the maximizer) and Bob (the minimizer). At the beginning of each stage of the game, Alice observed an independent drawing of a random source $X$ with a commonly known distribution. Next, each player played an action, which was monitored by the other player. The only source of randomization available to Alice was the outcomes of the random source $X$. Thus, Alice had to choose the action of each stage as a deterministic function of the history of her observations, i.e., the random sources revealed up to that stage and the previous actions. However, Bob could freely randomize his actions, and hence, at each stage, he chose his action as a random function of the actions played previously. Generalizing the model of Gossner and Vieille (2002), Valizadeh and Gohari (2017) considered the possibility of leakage of Alice’s random source sequence to Bob; thus, they called it the repeated game with leaked randomness source. In other words, Bob monitored the random source of Alice through a noisy channel. Specifically, let $(X_t, Y_t)$, $t = 1, 2, \ldots$, be a sequence of independent and identically distributed (i.i.d.) random variables distributed according to a given distribution $p_{XY}$. At an arbitrary stage $t$, before choosing the actions for that stage, Alice observed $X_t$, and Bob observed $Y_t$. In this model, Alice and Bob could randomize their actions at each stage only by conditioning their actions on the history of their observations up to that stage.
In this paper, we study two different aspects of the repeated game with leaked randomness sources. Our first contribution is to study the max-min payoff that Alice can secure in a repeated game with a finite number of stages. Note that Valizadeh and Gohari (2017) characterized the long-run max-min value, i.e., the maximum payoff that Alice can secure regardless of what strategy Bob chooses when the number of stages tends to infinity. More precisely, let $v_n$ be the max-min value of the $n$-stage repeated game with leaked randomness source. Valizadeh and Gohari (2017) characterized $\lim_{n\to\infty} v_n$. In this paper, we investigate how $v_n$ converges to its limit. To do so, we develop and utilize a new tool for simulation of a source from another source, which we introduce in Section 1.1.
Our second contribution is to study the set of equilibria that are implementable by Alice and Bob in the repeated game with leaked randomness sources. As stated above, implementable Nash equilibria do not necessarily exist. However, a relaxed version of Nash equilibria called approximate Nash equilibria may exist. Let $\epsilon_1$ and $\epsilon_2$ be arbitrary positive numbers. We say a given strategy profile forms an $(\epsilon_1, \epsilon_2)$-Nash equilibrium if Alice and Bob do not gain more than $\epsilon_1$ and $\epsilon_2$, respectively, by unilaterally changing their corresponding strategies. We characterize the set of $(\epsilon_1, \epsilon_2)$-Nash equilibria of the repeated game when the number of stages of the game tends to infinity. This set is characterized in terms of the maximum payoffs that Alice and Bob can secure in the long run (long-run max-min and min-max values).
Note that in previous works (Neyman and Okada (2000); Gossner and Vieille (2002); Valizadeh and Gohari (2017)), the max-min (or min-max) value of the zero-sum repeated game was achieved by autonomous strategies, i.e., strategies that are indifferent to the actions of the opponent in past stages. Therefore, we address the question of whether autonomous strategies are sufficient for achieving all implementable approximate Nash equilibria. To do this, we also characterize the set of all approximate Nash equilibria achieved by autonomous strategies in the long run. It will turn out that the set of approximate equilibria achieved by autonomous strategies is strictly smaller than the set of approximate equilibria achieved by arbitrary strategies.
1.1 A new tool
A key step in the proofs of Gossner and Vieille (2002) and Valizadeh and Gohari (2017) is to divide the total stages of the repeated game into blocks such that the actions of the first player in each block (excluding the first block) are generated as a function of the randomness source observed during the previous block. (Footnote 1: This strategy is known as the block Markov strategy in information theory and is utilized in multi-hop communication settings.) In other words, the actions of the first player in each block are simulated from the randomness source observed in the previous block. Since we are interested in the non-asymptotic regime where the number of stages is given and fixed, we need to carefully optimize over the length of the blocks and also prove a fine estimate on the accuracy of simulation of the actions of each block from the observations of the previous block. Thus, in order to study the repeated game with $n$ stages, we provide a new tool for simulation of a source from another source, which is of independent interest.
More precisely, in abstract terms, let and be arbitrary discrete random variables distributed according to some probability mass function , and let be a target random variable distributed according to . We would like to simulate from (by using a deterministic function ) in such a way that the resulting random variable, , is almost independent of , and its distribution is close to . Intuitively, if the amount of uncertainty of given is much more than the amount of uncertainty of , then, one might find a simulator satisfying the above conditions. We take the Rényi entropy as our measure of uncertainty, and the total variation distance as our measure of similarity. We prove that there exists a mapping such that for arbitrary ,
[TABLE]
where denotes the total variation distance, denotes the conditional Rényi entropy (with parameter ) of given , and is the Rényi entropy of with parameter . The main idea to prove Equation (1) is to relate it to norms of linear maps and then utilize the Riesz-Thorin Interpolation Theorem.
To better understand Equation (1), let us apply it to a sequence of random variables. Assume that , , …, are i.i.d. repetitions according to . Our goal is to simulate , which is an i.i.d. sequence according to . Applying Equation (1) to , , and , we obtain that there exists a mapping such that for arbitrary ,
[TABLE]
where we used the fact that and . Equation (2) shows that the accuracy of simulation is improving exponentially fast in the product of three terms: the block length , the term , and the entropy difference .
Moreover, Equation (1) can be interpreted in a different way: we say that $\mathsf{H}$ is a measure of randomness if for any discrete random variable $X$, $\mathsf{H}(X)$ is a non-negative real number. The value $\mathsf{H}(X)$ quantifies the amount of uncertainty in $X$. Then, $\mathsf{H}$ is a reasonable measure of randomness only if it is non-increasing under mappings. In other words, if random variable $A$ is a deterministic function of random variable $X$, we expect $\mathsf{H}(A) \le \mathsf{H}(X)$. The question then arises whether the converse to this statement can also be true:
Question: Is there a suitable measure of randomness $\mathsf{H}$ such that $\mathsf{H}(A) \le \mathsf{H}(X)$ if and only if there is a function $f$ such that $f(X)$ is distributed according to $p_A$?
While the answer to this question is negative, our tool shows that an approximate version of it holds. To see why the answer to this question is negative, let $X$ be a binary random variable. Then, $f(X)$ has the same amount of randomness as $X$ if $f$ is a one-to-one function ($f(0) \neq f(1)$), and $f(X)$ is deterministic if $f(0) = f(1)$. Therefore, $\mathsf{H}(f(X))$ cannot take values lying strictly between $0$ and $\mathsf{H}(X)$. However, if we require $f(X)$ to have a distribution that is “approximately” equal to $p_A$, the above question can be revisited. In fact, our tool shows that Rényi entropy is an answer for the approximate version of the above question.
Relation of Equation (1) to previous works: The problem of simulation of a source from another source dates back to the work of Von Neumann (1951), who considered the problem of generating a sequence of i.i.d. fair bits from a given sequence of i.i.d. unfair bits. The algorithm presented by Von Neumann (1951) is universal in the sense that it does not need the knowledge of the distribution of the input bits, and it is exact in the sense that the output bits are exactly fair. Von Neumann (1951) also offered a non-universal exact algorithm for simulation of a desired continuous distribution from a given continuous random variable with known distribution. A generalization of the algorithm of Von Neumann (1951) for arbitrary Markov inputs can be found in Elias (1972) and Bernardini and Rinaldo (2018). There are other works that have considered non-exact simulation of a source. Considering the total variation distance as the measure of accuracy, Yassaee et al. (2014) studied non-universal generation of independent fair bits from an i.i.d. sequence of random variables with side information, and Han (2003) considered the simulation of a general sequence from a general input sequence with known distribution. Fundamental limits for the generation of an arbitrary random sequence from a general sequence of random variables under different measures of accuracy have been studied by Vembu and Verdú (1995) and Yu and Tan (2019).
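As a concrete reference point for the exact and universal procedure mentioned above, the classical pairing scheme of Von Neumann (1951) for i.i.d. unfair bits can be sketched as follows. This is only an illustration of that earlier algorithm, not of the tool developed in this paper; the function name `von_neumann_extract` is an assumption introduced here.

```python
from typing import Iterable, List


def von_neumann_extract(bits: Iterable[int]) -> List[int]:
    """Map i.i.d. biased bits to exactly fair bits (pairing scheme).

    The input is read in non-overlapping pairs:
    (0, 1) -> output 0, (1, 0) -> output 1, equal pairs -> no output.
    """
    out: List[int] = []
    it = iter(bits)
    for first in it:
        second = next(it, None)
        if second is None:
            # Odd-length input: the last unpaired bit is discarded.
            break
        if first != second:
            out.append(first)
    return out


if __name__ == "__main__":
    # Pairs: (0,1) (1,1) (1,0) (0,0) (0,1) -> outputs 0, -, 1, -, 0
    print(von_neumann_extract([0, 1, 1, 1, 1, 0, 0, 0, 0, 1]))  # [0, 1, 0]
```

Because the pairs (0, 1) and (1, 0) have the same probability for any bias of the input bits, the output is exactly fair and the procedure never needs to know the bias. The price is a random output length, which is one reason approximate, fixed-length simulation is studied separately.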
The above works considered the simulation of an intended long sequence from a long input sequence. In contrast, a different approach for generating random bits (randomness extraction) is to provide results for arbitrary single-letter sources and then derive results for sequences; the works of Renner (2008), Hayashi (2011) and Mojahedian et al. (2018) on randomness extraction and privacy amplification lie in this category. The tool we present in this paper generalizes the results of Renner (2008), Hayashi (2011) and Mojahedian et al. (2018); in fact, they considered the special case of simulating a random variable $A$ having a uniform distribution over a set $\mathcal{A}$ (when $A$ is uniform, simulating $A$ can be interpreted as extracting $\log|\mathcal{A}|$ bits of randomness). Furthermore, in this paper, we adopt the total variation distance as the measure of accuracy, which is closely related to the expected payoff in games. We also use concentration inequalities to provide further refinements (Proposition 15).
The rest of this paper is organized as follows: In Section 2, we introduce the notation of this paper and present a brief discussion of Shannon and Rényi entropy. The repeated game with leaked randomness source is defined in Section 3, where we also provide our results on the convergence rate of the max-min payoff of games with a finite number of stages. In Section 3.2, we introduce our tool for simulation of a source from another source. In Section 4, we characterize the set of approximate Nash equilibria achievable in the long run. Some of the proofs are presented in the appendices.
2 Preliminaries
2.1 Notations
In this paper, we use the notation $x^n$ to represent a sequence of variables $(x_1, x_2, \ldots, x_n)$. The same notation is used to represent sequences of random variables, i.e., $X^n = (X_1, X_2, \ldots, X_n)$. This notation is used in the same way for sequences that have two subscripts. Calligraphic letters such as $\mathcal{X}$ represent finite sets, and $|\mathcal{X}|$ denotes the cardinality of the finite set $\mathcal{X}$. The Cartesian product of two sets $\mathcal{X}$ and $\mathcal{Y}$ is denoted by $\mathcal{X} \times \mathcal{Y}$, and $\mathcal{X}^n$ stands for the $n$-fold Cartesian product of $\mathcal{X}$. The set of natural numbers is represented by $\mathbb{N}$, and $\mathbb{R}$ denotes the set of real numbers. For a real number $x$, $\lfloor x \rfloor$ is the largest integer less than or equal to $x$, and $\lceil x \rceil$ is the smallest integer greater than or equal to $x$. Furthermore, let $f$ and $g$ be two real functions on the set of real numbers; we write $f(x) = O(g(x))$ if and only if there exists a real constant $c$ such that for all $x$, we have $|f(x)| \le c\,|g(x)|$. We use the notation $a_n = O(b_n)$ for real sequences $a_n$ and $b_n$ in the same manner.
The probability mass function (pmf) of a random variable $X$ is represented by $p_X$. When it is obvious from the context, we drop the subscript and use $p(x)$ instead of $p_X(x)$. We say that $X^n$ is drawn i.i.d. from $p_X$ if
$$p_{X^n}(x^n) = \prod_{i=1}^{n} p_X(x_i).$$
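We use $\Delta(\mathcal{X})$ to denote the probability simplex on alphabet $\mathcal{X}$, i.e., the set of all probability distributions on the finite set $\mathcal{X}$. The total variation distance between pmfs $p_X$ and $q_X$ is denoted by $\|p_X - q_X\|_{TV}$ and is defined as:
$$\|p_X - q_X\|_{TV} = \frac{1}{2} \sum_{x \in \mathcal{X}} |p_X(x) - q_X(x)|.$$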
Some of the properties of the total variation distance are summarized in the following lemma.
Lemma 1.
The following properties hold for the total variation distance:
Property 1:
;
Property 2:
;
Property 3:
.
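For readers who want to experiment numerically, the total variation distance can be computed as in the sketch below. It assumes the common normalization $\frac{1}{2}\sum_x |p(x) - q(x)|$ and represents pmfs as dictionaries; the names `total_variation` and `pushforward` are assumptions introduced here. The example also checks one well-known fact, namely that applying the same deterministic map to both variables cannot increase the distance.

```python
from typing import Callable, Dict, Hashable

Pmf = Dict[Hashable, float]


def total_variation(p: Pmf, q: Pmf) -> float:
    """Total variation distance, assuming the 1/2 * L1 normalization."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def pushforward(p: Pmf, f: Callable[[Hashable], Hashable]) -> Pmf:
    """Distribution of f(X) when X is distributed according to p."""
    out: Pmf = {}
    for x, px in p.items():
        y = f(x)
        out[y] = out.get(y, 0.0) + px
    return out


if __name__ == "__main__":
    p = {0: 0.5, 1: 0.3, 2: 0.2}
    q = {0: 0.2, 1: 0.3, 2: 0.5}
    parity = lambda x: x % 2
    print(total_variation(p, q))
    print(total_variation(pushforward(p, parity), pushforward(q, parity)))
```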
2.2 Shannon Entropy
Let $X$ and $Y$ be two random variables with joint probability distribution $p_{XY}$ and respective marginal distributions $p_X$ and $p_Y$. The Shannon entropy of the random variable $X$ is defined to be:
$$H(X) = -\sum_{x} p_X(x) \log p_X(x),$$
where $0 \log 0 = 0$ by continuity, and all logarithms in this paper are in base two. Since the Shannon entropy is a function of the pmf $p_X$, we sometimes write $H(p_X)$ instead of $H(X)$.
The conditional Shannon entropy of $X$ given $Y$ is defined as:
$$H(X|Y) = \sum_{y} p_Y(y) H(X|Y=y),$$
where $H(X|Y=y) = -\sum_{x} p_{X|Y}(x|y) \log p_{X|Y}(x|y)$.
The following properties hold for the entropy function:
2. For an arbitrary deterministic function $f$, we have $H(f(X)) \le H(X)$.
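The fact that a deterministic function cannot increase Shannon entropy is easy to check numerically. The following minimal sketch uses base-two logarithms, as in the rest of this section; the helper names `shannon_entropy` and `pushforward` and the example pmf are assumptions made for illustration.

```python
import math
from typing import Callable, Dict, Hashable

Pmf = Dict[Hashable, float]


def shannon_entropy(p: Pmf) -> float:
    """Shannon entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)


def pushforward(p: Pmf, f: Callable[[Hashable], Hashable]) -> Pmf:
    """Distribution of f(X) when X is distributed according to p."""
    out: Pmf = {}
    for x, px in p.items():
        y = f(x)
        out[y] = out.get(y, 0.0) + px
    return out


if __name__ == "__main__":
    uniform4 = {x: 0.25 for x in range(4)}
    merged = pushforward(uniform4, lambda x: x // 2)  # merges {0,1} and {2,3}
    print(shannon_entropy(uniform4), shannon_entropy(merged))
```

In this example the map merges symbols in pairs, so exactly one bit of the two available bits is lost.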
2.3 Rényi Entropy
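Let $X$ and $Y$ be two random variables with joint probability distribution $p_{XY}$ and respective marginal distributions $p_X$ and $p_Y$. For arbitrary $\alpha \in (0,1) \cup (1,\infty)$, the Rényi entropy of random variable $X$ with parameter $\alpha$ is defined as follows:
$$H_\alpha(X) = \frac{\alpha}{1-\alpha} \log \|p_X\|_\alpha,$$
where $\|p_X\|_\alpha = \left(\sum_{x} p_X(x)^\alpha\right)^{1/\alpha}$ is the $\alpha$-norm of $p_X$. Since the Rényi entropy is a function of the pmf $p_X$, we sometimes write $H_\alpha(p_X)$ instead of $H_\alpha(X)$.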
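The conditional Rényi entropy of $X$ given $Y$ with parameter $\alpha$ is defined as:
$$H_\alpha(X|Y) = \frac{\alpha}{1-\alpha} \log \sum_{y} p_Y(y) \|p_{X|Y=y}\|_\alpha,$$
where $p_{X|Y=y}$ is the conditional distribution of $X$ given $Y=y$.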
Rényi entropy is related to Shannon entropy by the following relations:
[TABLE]
Let us fix and consider as a function of . is analytic for all , and hence, differentiable of all orders. In this paper, we are interested in the values of Rényi entropy for . Particularly, for we have:
[TABLE]
Note that , and function is convex. Therefore, Jensen’s inequality implies that is non-negative. Using the Taylor expansion, for , we have:
[TABLE]
where the remainder is bounded as
[TABLE]
where
[TABLE]
Since and are functions of , instead of them, we will sometimes write and , respectively. Similarly, for the conditional Rényi entropy and for , we have
[TABLE]
where is the remainder term, and
[TABLE]
Again, Jensen’s inequality implies that is non-negative. Moreover, the remainder is bounded as
[TABLE]
where
[TABLE]
A more detailed analysis of the Rényi entropy with respect to the parameter can be found in (Beck and Schögl, 1995, Section 5).
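The behaviour of the (unconditional) Rényi entropy in its parameter can be explored with a short numerical sketch. It assumes the standard definition $H_\alpha(X) = \frac{1}{1-\alpha}\log\sum_x p_X(x)^\alpha$, which coincides with the norm-based form $\frac{\alpha}{1-\alpha}\log\|p_X\|_\alpha$; the function names and the example pmf are assumptions made for illustration.

```python
import math
from typing import Dict, Hashable

Pmf = Dict[Hashable, float]


def shannon_entropy(p: Pmf) -> float:
    """Shannon entropy in bits."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)


def renyi_entropy(p: Pmf, alpha: float) -> float:
    """Renyi entropy in bits for alpha > 0, alpha != 1."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    power_sum = sum(px ** alpha for px in p.values() if px > 0)
    return math.log2(power_sum) / (1.0 - alpha)


if __name__ == "__main__":
    p = {"a": 0.5, "b": 0.25, "c": 0.25}
    for alpha in (0.5, 0.9, 0.999, 1.001, 1.1, 2.0):
        print(alpha, renyi_entropy(p, alpha))
    print("Shannon:", shannon_entropy(p))
```

Running the loop shows the two facts used repeatedly in this paper: the Rényi entropy is non-increasing in $\alpha$, and it approaches the Shannon entropy as $\alpha$ tends to 1.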
3 Repeated games with leaked randomness source: convergence rate
In this section, we revisit the repeated game of Gossner and Vieille (2002). Here, we focus on its general version with a leaked randomness source studied by Valizadeh and Gohari (2017), who characterized the max-min value of the repeated game when the number of stages of the game tends to infinity. In contrast, we fix the number of stages of the game to $n$ and investigate the rate at which the max-min value of the $n$-stage game converges to the long-run max-min value.
3.1 Problem statement and results
Consider an stage repeated zero-sum game between players Alice() and Bob() with respective pure action sets and . Let and be the alphabet of randomness sources of Alice and Bob, respectively, and let be a publicly known pmf on . At each stage , random variables and are drawn independent of previous drawings according to , where is observed by Alice and is observed by Bob. Then, Alice and Bob choose respective actions and . At the end of stage , players monitor the chosen actions and , and Alice gets stage payoff from Bob. In order to choose and , players use the history of their observations until stage . Let and denote the history of observation of Alice and Bob (respectively) up to stage . Then, and , where and are deterministic functions by which Alice and Bob map their observations into their actions at stage . Notice that the mappings and are deterministic which means that the only source of randomization are (for Alice) and (for Bob). We call the -tuples and the strategies of Alice and Bob, respectively. The expected average payoff for Alice up to stage induced by strategies and is denoted by :
[TABLE]
where denotes the expectation with respect to the distribution induced by i.i.d. repetitions of and strategies and . Alice wishes to maximize and Bob’s goal is to minimize it.
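To make the model concrete, the expected average payoff of a small instance can be computed by exhaustive enumeration of the source sequences. The sketch below is an illustration under stated assumptions: each player's history is taken to consist of that player's own source symbols up to the current stage together with both players' past actions, strategies are plain Python functions of that history, and the names `expected_average_payoff`, `alice`, `bob` and `payoff` are introduced here for illustration only.

```python
import itertools
from typing import Callable, Dict, Tuple

JointPmf = Dict[Tuple[int, int], float]
# A strategy maps (own source symbols so far, Alice's past actions,
# Bob's past actions) to the action of the current stage.
Strategy = Callable[[tuple, tuple, tuple], int]


def expected_average_payoff(
    p_xy: JointPmf,
    alice: Strategy,
    bob: Strategy,
    payoff: Callable[[int, int], float],
    n: int,
) -> float:
    """Exact expected average payoff over n stages (enumerates all sources)."""
    total = 0.0
    for seq in itertools.product(list(p_xy), repeat=n):
        prob = 1.0
        for symbol in seq:
            prob *= p_xy[symbol]  # i.i.d. source pairs
        a_hist, b_hist = (), ()
        stage_sum = 0.0
        for t in range(n):
            xs = tuple(s[0] for s in seq[: t + 1])
            ys = tuple(s[1] for s in seq[: t + 1])
            a = alice(xs, a_hist, b_hist)  # deterministic in the history
            b = bob(ys, a_hist, b_hist)
            stage_sum += payoff(a, b)
            a_hist, b_hist = a_hist + (a,), b_hist + (b,)
        total += prob * stage_sum / n
    return total


if __name__ == "__main__":
    match = lambda a, b: 1.0 if a == b else 0.0   # Alice wants to match
    alice = lambda xs, ah, bh: xs[-1]             # play the fresh source bit
    no_leak = {(0, 0): 0.5, (1, 0): 0.5}          # Bob's source is constant
    full_leak = {(0, 0): 0.5, (1, 1): 0.5}        # Bob sees Alice's bit
    print(expected_average_payoff(no_leak, alice, lambda ys, ah, bh: 0, match, 3))
    print(expected_average_payoff(full_leak, alice, lambda ys, ah, bh: 1 - ys[-1], match, 3))
```

The two calls contrast the extreme cases of leakage: with an uninformative source for Bob, Alice's fresh fair bit is unpredictable, whereas with a fully leaked source Bob can always mismatch.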
We will refer to the above game as “the repeated game with leaked randomness source”. Another variant of this game, called “the repeated game with non-causal leaked randomness source”, is defined in the following remark.
Remark 2.
In the definition of the repeated game with leaked randomness source, we assumed that the randomness sources and are revealed to Alice and Bob causally as the game is played out. However, we can also consider the non-causal case in which the sources and are observed by Alice and Bob (respectively) before the game starts. In this case we have and . In order to distinguish the above two cases, we name the non-causal game as “the repeated game with non-causal leaked randomness source”.
Definition 3.
Let $v$ be an arbitrary real value:
1. Alice can secure $v$ in the $n$-stage repeated game if there exists a strategy for Alice such that for every strategy of Bob we have . The maximum of the set of payoffs that Alice can secure in the $n$-stage repeated game is called the max-min value of the $n$-stage game.
2. Alice can secure $v$ in the long run if there exists a sequence of strategies for Alice such that for all sequences of strategies of Bob we have . The supremum of the set of payoffs that Alice can secure in the long run is called the long-run max-min value of the game.
The set of all payoffs that can be secured in long run in the repeated game with leaked randomness source is characterized by Valizadeh and Gohari (2017) and restated here as Theorem 5. Before presenting Theorem 5, we need the following definition.
Definition 4.
In a stage game, the security level of mixed action for Alice is denoted by , and is defined as follows:
[TABLE]
Furthermore, the maximum payoff that Alice can secure in a stage game, by playing mixed actions of entropy at most , is denoted by , and is defined as:
[TABLE]
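The two quantities of Definition 4 can be approximated numerically for a small game. The sketch below is only an illustration under explicit assumptions: Alice has two pure actions, the security level of a mixed action is taken to be its worst-case expected payoff over Bob's pure actions, and the entropy-constrained maximum is found by a grid search. The names `security_level` and `max_secured_payoff`, the grid size and the matching-pennies payoff matrix are all introduced here for illustration; the upper concave envelope is not computed.

```python
import math
from typing import Sequence


def binary_entropy(p: float) -> float:
    """Entropy in bits of the mixed action (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def security_level(p: float, payoff: Sequence[Sequence[float]]) -> float:
    """Worst-case expected payoff of the mixed action (p, 1 - p).

    payoff[a][b] is Alice's payoff for pure actions a (Alice) and b (Bob).
    """
    n_b = len(payoff[0])
    return min(p * payoff[0][b] + (1 - p) * payoff[1][b] for b in range(n_b))


def max_secured_payoff(h: float, payoff, grid: int = 10001) -> float:
    """Grid-search approximation of the best security level at entropy <= h."""
    best = -math.inf
    for i in range(grid):
        p = i / (grid - 1)
        if binary_entropy(p) <= h + 1e-12:
            best = max(best, security_level(p, payoff))
    return best


if __name__ == "__main__":
    matching_pennies = [[1.0, 0.0], [0.0, 1.0]]  # Alice wins on a match
    for h in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(h, max_secured_payoff(h, matching_pennies))
```

For matching pennies the security level of $(p, 1-p)$ is $\min(p, 1-p)$, so the printed values grow from 0 at entropy 0 to 1/2 at entropy 1, which illustrates how an entropy budget limits what Alice can guarantee in a single stage.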
Theorem 5 (Valizadeh and Gohari (2017)).
Let be the upper concave envelope of defined in Definition 4. In the repeated game with leaked randomness source, Alice can secure in long run if and only if . Furthermore, in stage game, Alice can secure only if .
Theorem 5 implies that the long run max-min value of the repeated game with leaked randomness source is . Moreover, the max-min value of the -stage game is at most . In the following theorems we discuss how the max-min value of the -stage game converges to as increases.
Theorem 6.
In the repeated game with leaked randomness source, there exist real numbers , , and , such that the following property holds: for arbitrary sequences , and satisfying , , and , one can find a sequence of strategies such that for all sequences of strategies of Bob and for all we have
[TABLE]
We give an intuitive description of the terms in Equation (10) in Discussion 8 below. The formal proof of Theorem 6 is presented in Section 3.3.
Corollary 7.
In the repeated game with leaked randomness source, for each , let denote the max-min value of the -stage game. converges to with a rate of at least . To see this, let be the values in the statement of Theorem 6, and let be an arbitrary positive number such that and . Define , , and . Then, Theorem 6 implies that there exists a sequence of strategies such that for all sequences of strategies of Bob, and for all , we have
[TABLE]
To see this, observe that is decaying faster than . And
[TABLE]
Discussion 8.
We explain Equation (10) at an intuitive level. To generate the strategies of Theorem 6, we divide the total stages almost uniformly into blocks such that the actions of each block (besides the first block) are generated as a function of the randomness source observed during the previous block, and in all stages of the first block, an arbitrary action is played. Therefore, some payoff is lost during the first block; the term in Equation (10) corresponds to this loss. On the other hand, by dividing the total stages into blocks we get blocks of length at least . This affects the precision of the simulation of the intended distribution of actions from the randomness source observed in the previous block, which is reflected in the term
[TABLE]
This equation should be compared with (2), where the exponent of the simulation error is expressed as the product of three terms: the block length, a term , and the entropy difference . The term appears in Equation (11) as the block length (the length of each block is at least ). The sequence is a proxy for the term . Finally, considering the last term , we see that a larger entropy difference yields better simulation performance. On the other hand, requiring a large entropy difference restricts the set of action distributions and results in a payoff loss. The sequence is responsible for this trade-off. A larger results in more loss in payoff (the term in Equation 10) but a more accurate simulation (the term in the exponent of the exponential term in Equation 10).
Next, consider the repeated game with non-causal leaked randomness source (see Remark 2), where the players observe the whole sequence of their corresponding randomness sources before the game starts. We claim the following result:
Theorem 9.
In the repeated game with non-causal leaked randomness source (as described in Remark 2), there exist real numbers , , and with the following property: for arbitrary sequences of positive numbers and satisfying and , there exists a sequence of strategies such that for all sequences of strategies of Bob and for all we have
[TABLE]
Proof of Theorem 9 is given in Section 3.4.
Corollary 10.
In the repeated game with non-causal leaked randomness source, for each , let denote the max-min value of the -stage game. converges to with a rate of at least . To see this, let be the values in the statement of Theorem 9, and let be an arbitrary positive number such that and . Define , and . Then, using similar calculations as in Corollary 7, Theorem 9 implies that there exists a sequence of strategies such that for all sequences of strategies of Bob, and for all , we have
[TABLE]
Theorem 6 and Theorem 9 provide a convergence rate for general games. However, in some special cases we can derive faster convergence rates for the max-min value of the game. The following theorem provides a special case in which an exponential convergence is obtained.
Theorem 11.
Let be an equilibrium strategy for Alice in the one stage game, i.e.,
[TABLE]
If , then, in the repeated game with non-causal leaked randomness source, there exist real numbers , and a sequence of strategies such that for all sequences of strategies of Bob and for all , we have
[TABLE]
The proof of Theorem 11 is provided in Section 3.5.
3.2 A technical tool: simulation of a source from another source
To prove the results of Section 3.1, we need a technical tool provided in this section. Here, we study the simulation of a desired single letter source from a given single letter source . We assume that is correlated with a side information , and we would like the generated source to be almost independent of the side information . More precisely, we have the following definition:
Definition 12.
Let be distributed according to , and be distributed according to . We say that the deterministic mapping simulates from with precision if we have
[TABLE]
where is the joint distribution of and .
According to the above definition, we are interested in a deterministic mapping that simulates from . However, we utilize the probabilistic method and random mappings, as a tool to ultimately prove existence of a suitable deterministic mapping. Therefore, we now define a random mapping and proceed by proving some properties for it. These properties will then lead to the construction of the desired deterministic mapping.
To specify a deterministic mapping , we need to specify the value of for all . To specify a random mapping , we need to specify the joint distribution of the random variables for .
Definition 13.
is a random mapping constructed as follows: assume that for different values of are i.i.d. according to . In other words, given string of symbols for all ,
[TABLE]
The above construction of the random mapping defines a probability measure on the set of all mappings denoted by .
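The random-mapping construction can be tried out numerically. The sketch below is a simplified illustration: it ignores the side information, draws the image of every source symbol independently from the target pmf, and measures how far the induced output distribution is from the target in total variation (assuming the $\frac{1}{2}\sum|\cdot|$ normalization). The names `draw_random_mapping`, `simulated_pmf` and `total_variation`, the alphabet size and the target pmf are assumptions made for illustration.

```python
import random
from typing import Dict, Hashable, Iterable

Pmf = Dict[Hashable, float]


def draw_random_mapping(source_alphabet: Iterable, p_target: Pmf,
                        rng: random.Random) -> Dict:
    """Draw a mapping whose values are i.i.d. samples from p_target."""
    symbols = list(p_target)
    weights = [p_target[a] for a in symbols]
    return {x: rng.choices(symbols, weights=weights)[0] for x in source_alphabet}


def simulated_pmf(p_source: Pmf, mapping: Dict) -> Pmf:
    """Distribution of mapping[X] when X is distributed according to p_source."""
    out: Pmf = {}
    for x, px in p_source.items():
        out[mapping[x]] = out.get(mapping[x], 0.0) + px
    return out


def total_variation(p: Pmf, q: Pmf) -> float:
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in support)


if __name__ == "__main__":
    size = 4096                                   # 12 bits of uniform source
    p_source = {x: 1.0 / size for x in range(size)}
    p_target = {0: 0.7, 1: 0.3}                   # low-entropy target
    mapping = draw_random_mapping(p_source, p_target, random.Random(0))
    print(total_variation(simulated_pmf(p_source, mapping), p_target))
```

Once drawn, the mapping is an ordinary deterministic function of the source symbol, which is how the probabilistic method yields a deterministic simulator. In this example the source carries far more randomness than the target, so a typical draw already gives a small distance.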
Lemma 14.
Let be distributed according to and according to . Furthermore, let be the random mapping defined in Definition 13. Then,
[TABLE]
where is the joint distribution of and . Consequently, there exists a deterministic mapping such that for all , we have
[TABLE]
Proof of Lemma 14 is provided in Appendix A.
While the above inequality ensures the existence of a deterministic mapping where (15) holds, it does not provide an explicit mapping . An explicit construction is desirable from an algorithmic perspective. In the following, we address this issue by showing that any randomly chosen mapping would almost satisfy (15) with very high probability.
Let . The quantity is random because is random. Thus, random variable is a function of the random variable , i.e., takes value with probability . Hence, Lemma 14 implies that for all ,
[TABLE]
We claim the following bound on how concentrates around its expected value.
Proposition 15.
For the random variable , we have
[TABLE]
Proof of Proposition 15 is presented in Appendix B.
One application of Proposition 15 is for simulation of i.i.d. sequences. Let be i.i.d. according to , and let be i.i.d. according to . Assume that so that simulation of with arbitrary precision is possible. Let be the random mapping of Definition 13, where is replaced with . Let us choose such that (note that such a real number exists since , and Rényi entropy converges to Shannon entropy as tends to ). Let be a positive number such that
[TABLE]
Then, Lemma 14 implies
[TABLE]
where . Furthermore, from Proposition 15, for , we have
[TABLE]
The above equation along with Equation (16) and definition implies
[TABLE]
In other words, the outcome of the random mapping , with probability at least (converging doubly exponentially to $1$), will simulate with precision at most (decaying exponentially in ).
3.3 Proof of Theorem 6
Let us divide the total stages, , into blocks, where is the arbitrary sequence of natural numbers in the statement of the theorem. Let be the remainder of divided by , i.e., . Then, the number of stages in each block, , is computed as follows:
[TABLE]
In other words, first, all blocks get stages, then, the remaining stages are assigned to the first blocks.
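The almost-uniform division of stages into blocks described above can be written out directly. In this sketch, `n` stands for the total number of stages and `k` for the number of blocks; both names, and the function name `block_lengths`, are assumptions introduced for illustration.

```python
from typing import List


def block_lengths(n: int, k: int) -> List[int]:
    """Split n stages into k blocks whose lengths differ by at most one.

    Every block first receives floor(n / k) stages; the r = n mod k
    remaining stages are then given, one each, to the first r blocks.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    base, r = divmod(n, k)
    return [base + 1 if j < r else base for j in range(k)]


if __name__ == "__main__":
    print(block_lengths(10, 3))  # [4, 3, 3]
```

Every block therefore has at least $\lfloor n/k \rfloor$ stages, which is the block-length guarantee that the simulation bound relies on.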
Let and denote the sequence of actions played in block by Alice and Bob, respectively. Similarly, let and denote the sequence of random sources observed in block by Alice and Bob, respectively. We generate strategy for Alice as follows: in all stages of the first block, Alice chooses an arbitrary action ; in each block , Alice chooses her action sequence as a deterministic function of the sequence of random sources observed during the previous block, . Let us denote this deterministic function by . Thus, we have
[TABLE]
In order to fulfill the definition of the strategy , it suffices to determine the functions for . We will now determine the functions after presenting some preliminaries.
Considering the definition of the function , there exist real number and pmfs such that:
[TABLE]
Without loss of generality, we may assume that . The following lemma claims that we may assume that , and also satisfy the following equations:
[TABLE]
Lemma 16.
Theorem 6 holds if Equations (20) and (21) fail to hold.
Proof of the above lemma is given later in Section 3.3.1.
We identify the value of in the statement of the theorem as the one given by Equations (18) and (19). The values for , and will be identified later. Take an arbitrary sequence of positive numbers, as in the statement of the theorem, such that , for all . Let
[TABLE]
Moreover, consider an ideal distribution defined as follows for :
[TABLE]
For each , we choose to be the mapping of Lemma 14 that simulates from ; hence, for all , we have
[TABLE]
where is the joint pmf of and . Next, note that
[TABLE]
On the other hand, since are drawn i.i.d. from , we have
[TABLE]
where we used , which follows from the definition given in Equation (17). Moreover, let be a fractional approximation of defined as follows
[TABLE]
Observe that
[TABLE]
Equations (23), (24) and (25) imply
[TABLE]
Using Equations (3)-(6), we bound the exponent of the exponential term in the right-hand side of the above equation as below:
[TABLE]
where in (27) we used the fact that and . On the other hand, Equations (19) and (20) along with the fact that imply
[TABLE]
where
[TABLE]
Next, let us define
[TABLE]
Then, Equations (27) and (28) imply
[TABLE]
where (29) results from . By using (29) in (26), and simplifications and we obtain
[TABLE]
Next, let , where is the arbitrary sequence of positive real numbers in the statement of the theorem. Then, Equation (30) results in
[TABLE]
Now, we need to include the sequence of actions of Bob at the -th block () into Equation (31). To do so, note that is independent of Alice’s actions in all blocks, except for the -th block. This is because the -source is i.i.d. and is a function of . Therefore, at -th stage of block number , Bob obtains information about only through his source and prior actions . In other words, is conditionally independent of given . Since , is also conditionally independent of given . Thus,
[TABLE]
Then, utilizing the first property of total variation in Lemma 1 for random variables and we conclude from (31) that
[TABLE]
Next, by utilizing the second property of total variation in Lemma 1 for random variables and , and replacing from Equation (22) we conclude
[TABLE]
In other words, the distribution of the generated actions is in distance from the ideal distribution . Note that the ideal distribution secures payoff in the -th block. Therefore, in the -th block, the generated strategy secures payoff
[TABLE]
where . Thus, for arbitrary strategy of Bob we have
[TABLE]
where , and inequality (33) follows from Equation (18) and the fact that ; Equation (34) is implied by , and (35) results from and .
Note that Equation (20) implies that ; therefore, by replacing the value of , and defining , (35) implies the claim of the theorem.
3.3.1 Proof of Lemma 16
We need to consider the case of or .
1. The case of and : here, is deterministic (it outputs an action with probability ), and hence, the trivial strategy of playing in all stages secures payoff for Alice; therefore, in this case, the claim of the theorem holds with .
2. The case of and : in this case, let , , and let be an arbitrary deterministic pmf. Then, , and satisfy Equations (18)-(21). Therefore, we can proceed with the proof of Theorem 6 with these assumptions.
3. If , then Alice can achieve by playing a pure action, and the claim of the theorem holds with .
4. If and , then we can change to an arbitrary deterministic pmf so that Equations (18)-(21) hold. Therefore, we can proceed with the proof of Theorem 6 with these assumptions.
5. If and , then we can change to , and to a deterministic pmf such that Equations (18)-(21) hold. This is because and imply that , since otherwise, by changing we would get a greater value for , which contradicts the definition of the upper concave envelope.
3.4 Proof of Theorem 9
The proof is similar to the proof of Theorem 6 with a few modifications. More specifically, in a repeated game with non-causal leaked randomness source we do not need to divide the total stages into blocks; instead, we can generate all actions of Alice as a function of the whole randomness source. Let and denote the sequences of actions of Alice and Bob, and let and denote the sequences of random sources of Alice and Bob, respectively. We generate strategy for Alice such that Alice chooses her action sequence as a deterministic function of , i.e.,
[TABLE]
We will now determine the function after presenting some preliminaries.
As stated in the proof of Theorem 6 in Section 3.3, we assume that there exist real number and pmfs satisfying (18)-(21). Moreover, let
[TABLE]
and let be an ideal distribution of actions defined as follows:
[TABLE]
We choose to be the mapping of Lemma 14 that simulates from ; hence, for all we have
[TABLE]
Next, note that , and . Thus, defining , Equation (37) implies
[TABLE]
A similar argument as the one used to prove Equation (29) in Section 3.3 implies
[TABLE]
where , and
[TABLE]
By using (39) in (38), and simplification , we obtain
[TABLE]
Next, let , where is the arbitrary sequence of positive real numbers in the statement of the theorem; hence, Equation (40) results in
[TABLE]
Now, we need to include the sequence of actions of Bob () into Equation (41). Note that at each stage , Bob has access to information ; thus, given an arbitrary strategy for Bob, we have
[TABLE]
Then, by an argument similar to the one used in Section 3.3 to prove Equation (32), the above equation along with Equation (41) implies
[TABLE]
In other words, the distribution of the generated actions is in distance from the ideal distribution. Note that the ideal distribution secures payoff . Therefore, we have
[TABLE]
where , , and the second inequality follows from Equation (18) along with the fact that . By replacing and defining , we obtain the claim of the theorem.
3.5 Proof of Theorem 11
The inequality along with the fact that is an equilibrium strategy for Alice in the stage game implies that
[TABLE]
If Alice could play i.i.d. according to , she would have secured payoff . Our goal is to generate the actions of Alice, , as a deterministic function of the randomness source in such a way that at every stage , the distribution of the action is almost and is almost independent of the past observations of Bob.
The strategy is defined as follows: the actions are chosen as a deterministic function of , i.e., . We will now define the mapping . Consider an ideal distribution defined as below:
[TABLE]
Let be the mapping of Lemma 14 that simulates from ; hence, for all , we have
[TABLE]
where is the joint pmf of and . Note that are drawn i.i.d. from , and is i.i.d. as well, thus, we have
[TABLE]
Furthermore, let be defined as follows
[TABLE]
Note that ; hence, . Equations (45) and (46) along with the above definition of imply
[TABLE]
Next, let Bob play an arbitrary strategy and let denote the sequence of actions of Bob. At stage , Bob generates as a function of and his previous observations and . Hence, we have
[TABLE]
Then, utilizing the first property of total variation in Lemma 1 for random variables and , we conclude from (47) that
[TABLE]
Next, by utilizing the second property of total variation in Lemma 1 for random variables and , and replacing from Equation (44) we conclude
[TABLE]
In other words, the distribution of the generated actions is in distance from the ideal distribution . Note that the ideal distribution secures payoff for Alice. Therefore, Equation (48) implies that
[TABLE]
where . Note that is an arbitrary strategy for Bob, therefore, the above inequality along with (43) implies the claim of the theorem.
4 Approximate Nash equilibria of the repeated game with leaked randomness source
In the repeated game with leaked randomness source defined in Section 3.1, we have forced the players to randomize their actions just by conditioning them to the outcomes of the random sources and . In this setting, Nash equilibria do not necessarily exist (See Hubáček et al. (2016) and Budinich and Fortnow (2011)). However, approximate Nash equilibria may exist. The goal of this section is to characterize the set of approximate Nash equilibria achievable by the randomness sources and . To proceed, consider the following definitions.
Definition 17.
In the stage repeated game, the strategy profile is an -Nash equilibrium if Alice (resp. Bob) cannot increase (resp. decrease) the expected average payoff (defined in Equation (7)) by more than (resp. ) by changing her (resp. his) strategy unilaterally.
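The deviation test in Definition 17 can be illustrated on a one-shot matrix game. This is only a sketch under stated assumptions: it checks mixed actions in a single stage rather than strategies in the repeated game, Alice is the row player and maximizer, and the names `U`, `p`, `q`, `eps_a` and `eps_b` are hypothetical.

```python
def expected_payoff(U, p, q):
    """Expected payoff when Alice mixes rows with p and Bob mixes columns with q."""
    return sum(p[a] * q[b] * U[a][b]
               for a in range(len(p)) for b in range(len(q)))


def is_approx_equilibrium(U, p, q, eps_a, eps_b, tol=1e-12):
    """One-shot analogue of the definition: no unilateral deviation helps
    Alice by more than eps_a or helps Bob by more than eps_b."""
    value = expected_payoff(U, p, q)
    # Best pure reply of Alice (maximizer) against Bob's mixed action q.
    best_alice = max(sum(q[b] * U[a][b] for b in range(len(q)))
                     for a in range(len(p)))
    # Best pure reply of Bob (minimizer) against Alice's mixed action p.
    best_bob = min(sum(p[a] * U[a][b] for a in range(len(p)))
                   for b in range(len(q)))
    return (best_alice - value <= eps_a + tol) and (value - best_bob <= eps_b + tol)
```

Checking pure replies suffices in this sketch because the expected payoff is linear in each player's own mixed action.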
Definition 18.
We say is a -Nash equilibrium payoff if for arbitrary there exists a natural number and a sequence of strategy profiles such that for all , forms a -Nash equilibrium, and .
Definition 19.
-Nash equilibrium is achievable in long run if for all , there exists a natural number such that for all , in the -stage repeated game, there exists a -Nash equilibrium.
We will now characterize the set of all approximate Nash equilibria of the repeated game with leaked randomness source. To do so, we first need to comment on the long run security level of the players. As stated in Theorem 5, Alice can secure arbitrary payoff in long run if and only if , where is the upper concave envelope of defined in Definition 4. Using Theorem 5, we can derive a similar result from Bob’s (the minimizer) point of view. Consider the following definitions:
Definition 20.
Let be an arbitrary real value:
1. Bob can secure in the stage repeated game if there exists a strategy for Bob such that for every strategy of Alice we have .
2. Bob can secure in long run if there exists a sequence of strategies for Bob such that for all sequences of strategies of Alice we have .
Definition 21.
In a stage game, the security level of mixed action for Bob is denoted by , and is defined as follows:
[TABLE]
Furthermore, the minimum cost that Bob can secure in a stage game, by playing mixed actions of entropy at most , is denoted by , and is defined as:
[TABLE]
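The two quantities of Definition 21 can be sketched numerically for a game in which Bob has two actions. The sketch rests on assumptions not fixed by the surviving text: entropy is measured in bits, Alice is the row player and maximizer, the minimization is done by a plain grid search, and the names `U`, `q`, `h` and `grid` are hypothetical.

```python
import math


def entropy_bits(q):
    """Shannon entropy of a pmf, in bits (0 * log 0 is treated as 0)."""
    return -sum(x * math.log2(x) for x in q if x > 0)


def bob_security_level(U, q):
    """Largest expected payoff Alice can obtain against Bob's mixed action q."""
    return max(sum(q[b] * row[b] for b in range(len(q))) for row in U)


def bob_min_cost_two_actions(U, h, grid=1000):
    """Grid-search approximation of the smallest security level Bob can get
    with a mixed action over two columns whose entropy is at most h bits."""
    best = math.inf
    for i in range(grid + 1):
        t = i / grid
        q = (t, 1.0 - t)
        if entropy_bits(q) <= h + 1e-12:  # keep only entropy-feasible actions
            best = min(best, bob_security_level(U, q))
    return best
```

In matching pennies with payoffs of plus and minus one, an entropy budget of zero forces a pure action and a security level of 1, while a budget of one bit allows the uniform action and a security level of 0.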
Next, by replacing the stage payoff with , and hence, considering Bob as the maximizer, we can deduce the following corollary of Theorem 5.
Corollary 22.
Let be the lower convex envelope of defined in Definition 21. In the repeated game with leaked randomness source, Bob can secure in long run if and only if . Furthermore, in stage game, Bob can secure only if .
Note that the functions and are respectively increasing and decreasing in . On the other hand, the minimax theorem (Von Neumann (1928)) implies that ; thus, for arbitrary and , we have . Hence, .
In the following theorem we characterize the set of achievable -Nash equilibrium payoffs in terms of the individually secured payoffs and .
Theorem 23.
In the repeated game with leaked randomness source defined in Section 3.1, is a -Nash equilibrium payoff if and only if
[TABLE]
where and are the maximum and minimum entries of the payoff table, respectively (, and ).
The proof of Theorem 23 is provided in Section 4.1.
Remark 24.
If , the set of payoffs satisfying (51) is empty unless and are large enough such that
[TABLE]
where . In this case, if (52) holds, the only equilibrium payoff is , i.e., the max-min value of the stage game. This particular result coincides with the "Folk Theorem" for two-player zero-sum repeated games in which players can freely randomize their actions.
We can refine Theorem 23 to characterize the set of and for which -Nash equilibrium is achievable in long run. Let and be arbitrary positive numbers. If there exists a satisfying Equation (51), then, Theorem 23 implies that -Nash equilibrium is achievable. On the other hand, let form an -Nash equilibrium; then, Theorem 23 implies that satisfies (51). In other words, -Nash equilibrium is achievable if and only if there exists a real number satisfying (51). Therefore, by removing from Equation (51), and rewriting it in terms of and , we conclude the following corollary of Theorem 23.
Corollary 25.
In the repeated game with leaked randomness source defined in Section 3.1, -Nash equilibrium is achievable in long run if and only if
[TABLE]
4.1 Proof of Theorem 23
We prove that inequality (51) is both necessary and sufficient for to be a -Nash equilibrium payoff.
Inequality (51) is necessary: In the stage repeated game, let and be arbitrary strategies for Alice and Bob generating an -Nash equilibrium. According to Corollary 22, given the strategy for Bob, there exists a strategy for Alice such that . Hence,
[TABLE]
But is a -Nash equilibrium, thus, we should have
[TABLE]
Similarly, Theorem 5 implies that given the strategy for Alice, there exists a strategy for Bob such that . Hence,
[TABLE]
But is a -Nash equilibrium, thus, we should have
[TABLE]
On the other hand, since is a convex combination of the entries of the payoff table, we have ; this fact along with Equations (54) and (55) implies that
[TABLE]
For to be achievable, the above relation should be satisfied for , and arbitrarily close to , and , respectively. Thus, Equation (51) must hold.
Inequality (51) is sufficient: Let , and be real numbers satisfying (51). Equation (51) implies that ; hence, can be expressed as a convex combination of the entries of the payoff table; i.e., there exist action profiles , and non-negative numbers summing to one such that
[TABLE]
Let us approximate each by a rational number such that ; for arbitrary , we can choose large enough such that
[TABLE]
where . We take so large that not only inequality (56) is satisfied, but also there exist strategies and such that in the -stage repeated game, secures expected average payoff of for Alice, and secures for Bob (Such strategies and exist since in long run, Alice can secure , and Bob can secure ).
Now, we are ready to construct the desired approximate Nash equilibrium . Let the total stages of the game be of the form , and let us divide the total stages into blocks of length . The value of will be set in the sequel. In each block, Alice and Bob cycle through the action profiles such that each action profile is repeated in stages. Note that the actions are deterministic; thus, each player can monitor the actions of the other player to see whether he/she is still following the rule. If Alice (resp. Bob) deviates from the rule, then, in the upcoming blocks, Bob (resp. Alice) plays according to the strategy (resp. ) to secure payoff (resp. ).
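The deterministic cycling rule and the monitoring step can be sketched as below. This is an illustration under assumptions: the integer repetition counts `counts` stand in for the numerators of the rational weights, the block length is their sum, and all function and parameter names are hypothetical. The punishment strategies themselves are not modelled.

```python
from fractions import Fraction


def cyclic_block(profiles, counts):
    """One block of the agreed play: profile i is repeated counts[i] times."""
    block = []
    for profile, m in zip(profiles, counts):
        block.extend([profile] * m)
    return block


def block_average_payoff(U, block):
    """Exact average payoff of one block when both players follow the rule."""
    return Fraction(sum(U[a][b] for a, b in block), len(block))


def first_deviation(block, played):
    """Index of the first stage where the played profile differs from the
    agreed one (the block repeats cyclically), or None if the rule is followed."""
    for t, profile in enumerate(played):
        if profile != block[t % len(block)]:
            return t
    return None
```

Since the agreed play is deterministic, a single comparison per stage is enough to detect a deviation, which is what lets the other player switch to a securing strategy in the following blocks.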
When Alice and Bob both play according to respective strategies and , the expected average payoff equals , i.e.,
[TABLE]
Next, we show that the strategy profile forms the desired approximate Nash equilibrium. Let Alice deviate from strategy , and play an arbitrary strategy . Furthermore, let the deviation be detected by Bob at block . The expected average payoff will be in the blocks before the -th block, and the payoff of the blocks after the -th block (where Bob plays ) will be at most . In the -th block, Alice could get at most , thus,
[TABLE]
Equation (58) along with Equation (57) implies:
[TABLE]
where (59) follows from Equation (56), and the fact that . On the other hand, Equation (51) implies:
[TABLE]
Equations (59) and (60) imply:
[TABLE]
By a similar argument we can also show that for arbitrary strategy for Bob we have
[TABLE]
Inequalities (61) and (62) imply that the strategy profile forms a -Nash equilibrium with expected average payoff . We can choose small enough and large enough to make as small as desired, and hence, as close to as desired (according to (56)). Thus, is an -Nash equilibrium payoff.
4.2 Approximate Nash equilibria achieved by autonomous strategies
We call a strategy an autonomous strategy if the action of each stage is indifferent about the actions of the opponent in the previous stages. Formally, in the stage repeated game with leaked randomness sources defined in Section 3.1, strategies and are autonomous if for arbitrary and arbitrary histories , , and we have
[TABLE]
Autonomous strategies are sufficient for construction of a Nash equilibrium for two-player zero-sum repeated games, where players can freely randomize their actions. Furthermore, in the repeated game with leaked randomness source, the maximum securable payoff of Alice (the max-min payoff) can be secured by an autonomous strategy (see (Valizadeh and Gohari, 2017, Section 3.3)). Therefore, we are also interested in the set of approximate Nash equilibria achievable by the class of autonomous strategies.
In this section, we characterize the set of approximate Nash equilibria achievable by autonomous strategies in a simplified version of the repeated game with leaked randomness source. In the simplified version, we assume that the randomness sources and are independent, i.e., , thus, we call it the repeated game with independent randomness sources. It will turn out that the set of approximate Nash equilibria achievable by autonomous strategies is strictly smaller than the set of all achievable approximate Nash equilibria in Corollary 25. To proceed we need the following definition.
Definition 26.
-Nash equilibrium is achievable by autonomous strategies if for all there exists a natural number and a sequence of autonomous strategy profiles such that for all , forms a -Nash equilibrium in the stage repeated game.
Theorem 27.
In the repeated game with independent randomness sources, -Nash equilibrium is achievable by autonomous strategies if and only if there exist random variables , and such that and
[TABLE]
where and are defined as follows
[TABLE]
Proof of Theorem 27 is provided in Appendix C.
Example 28.
Consider a repeated game with independent randomness sources and such that , and . The sets of actions of Alice and Bob are , and the payoff table is as follows:
[TABLE]
Since , Alice must play deterministic actions by which she can secure at most ; hence, . On the other hand, Bob has access to one bit randomness per stage, thus, in each stage, he can play according to the max-min strategy of the one shot game and secure [math], hence, . Consequently, Corollary 25 implies that -Nash equilibrium is achievable if and only if . Hence, -Nash equilibrium is achievable. It is straightforward to check that and does not satisfy the conditions of Theorem 27; therefore, in the repeated game of this example, -Nash equilibrium is not achievable by autonomous strategies.
Remark 29.
In the repeated game of Example 28, the set of approximate Nash equilibria achievable by autonomous strategies is strictly smaller than the set of approximate Nash equilibria achievable by arbitrary strategies. Therefore, for achieving approximate equilibria of the repeated games with leaked randomness source, autonomous strategies are not sufficient.
Appendix A Proof of Lemma 14
For arbitrary we have
[TABLE]
where is the indicator function, and Equation (A.1) follows from
[TABLE]
and reordering the summations. Equation (A.2) follows from , and . Inequality (A.3) follows from Jensen’s inequality for the concave function .
Next, we claim that for arbitrary and ,
[TABLE]
Therefore, Equations (A.3) and (A.4) imply
[TABLE]
The above equation completes the proof, provided that Equation (A.4) holds. Thus, we only need to prove the claim of Equation (A.4). Instead of proving Equation (A.4), we prove Equation (A.5), which is obtained by replacing with an arbitrary real function :
[TABLE]
In order to interpret the above inequality, let us define -finite measure spaces and , where for all , , and for all , . Furthermore, let be a linear operator that maps (the set of real valued functions on ) to (the set of real valued functions on ) and is defined as below
[TABLE]
Moreover, consider the following definition:
Definition 30.
Let and be -finite measure spaces, and be a real function on the measure space . For arbitrary , the -norm of is denoted by and is defined as below:
[TABLE]
denotes the set of real functions with bounded -norm, i.e., . For arbitrary , let be an operator that maps the real functions on the measure space to the real functions on the measure space . denotes the operator norm of defined as follows:
[TABLE]
Using the above definition and Equation (A.6), we can rewrite Equation (A.5) as follows
[TABLE]
In order to prove Equation (A.7), it suffices to prove it for the special cases and , then, the general form with arbitrary will be concluded from the well-known Riesz-Thorin interpolation theorem.
Theorem 31 (Riesz-Thorin Interpolation Theorem).
Let and be arbitrary -finite measure spaces. Suppose , , and let be an arbitrary linear operator that maps and boundedly into and , respectively. For arbitrary , let and , then, maps boundedly into and satisfies the operator norm estimate
[TABLE]
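For reference, the conclusion of the Riesz-Thorin theorem is commonly written as below. The symbols $p_0, p_1, q_0, q_1, \theta$ and $T$ are assumed notation, since the displayed formulas did not survive extraction. The estimate is quoted in its usual form for complex-valued function spaces, and it may differ in presentation from the paper's own statement.

```latex
\frac{1}{p_\theta} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1},
\qquad
\frac{1}{q_\theta} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1},
\qquad
\|T\|_{L^{p_\theta} \to L^{q_\theta}}
\le
\|T\|_{L^{p_0} \to L^{q_0}}^{\,1-\theta}\,
\|T\|_{L^{p_1} \to L^{q_1}}^{\,\theta},
\qquad 0 < \theta < 1 .
```

In words, a bound at two endpoint exponent pairs yields a bound at every intermediate pair, which is why it suffices to verify the two special cases treated next.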
We complete the proof by proving Equation (A.7), or equivalently Equation (A.5), for and . For , we have
[TABLE]
where (A.8) follows from the property of the random mapping that . Therefore, Equation (A.5) holds for .
For we have
[TABLE]
where (A.10) follows from the fact that for distinct and , is independent of , and . Equation (A.11) is implied by .
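To make the object studied in this appendix concrete, the sketch below draws one random mapping and measures how far the induced distribution is from the target. It rests on assumptions: each coin outcome is mapped independently to a target symbol drawn from the target pmf (the construction suggested by the independence properties used above), the coin source is uniform, total variation carries the common factor 1/2, and all names are hypothetical.

```python
import random
from collections import defaultdict


def draw_random_mapping(coin_outcomes, target_pmf, rng):
    """Assumed construction: map every coin outcome independently to a
    target symbol sampled from target_pmf."""
    symbols = list(target_pmf)
    weights = [target_pmf[s] for s in symbols]
    return {x: rng.choices(symbols, weights=weights, k=1)[0]
            for x in coin_outcomes}


def induced_pmf(coin_pmf, mapping):
    """Distribution of mapping(X) when X is drawn from coin_pmf."""
    out = defaultdict(float)
    for x, px in coin_pmf.items():
        out[mapping[x]] += px
    return dict(out)


def total_variation(p, q):
    """Total variation distance with the 1/2 normalization (assumed convention)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)


def simulation_error(num_coin_outcomes, target_pmf, seed=0):
    """Total variation error of one random mapping fed by a uniform coin source."""
    rng = random.Random(seed)
    coin_pmf = {x: 1.0 / num_coin_outcomes for x in range(num_coin_outcomes)}
    mapping = draw_random_mapping(coin_pmf, target_pmf, rng)
    return total_variation(induced_pmf(coin_pmf, mapping), target_pmf)
```

Calling `simulation_error` with a growing number of coin outcomes gives a rough feel for how the error behaves as the coin source carries more entropy, although a single random draw can fluctuate.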
Appendix B Proof of Proposition 15
is fully described by its elements , and hence, is a deterministic function of , i.e.,
[TABLE]
where is a deterministic function defined as follows:
[TABLE]
Since is a function of independent random variables, we can apply McDiarmid’s inequality (McDiarmid (1989)).
Let be two arbitrary mappings with equal assignments for all elements of except for some element , i.e.,
[TABLE]
Then, we have:
[TABLE]
Therefore, we have
[TABLE]
Furthermore, recall that are independent random variables. Hence, McDiarmid’s inequality (McDiarmid (1989)) implies:
[TABLE]
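For reference, the standard form of McDiarmid's inequality is recalled below. The symbols $f$, $X_1,\dots,X_m$, $c_i$ and $t$ are generic notation assumed here, not necessarily the paper's own.

```latex
% Bounded-differences condition: changing only the i-th argument
% changes f by at most c_i.
\sup_{x_1,\dots,x_m,\,x_i'}
\big| f(x_1,\dots,x_i,\dots,x_m) - f(x_1,\dots,x_i',\dots,x_m) \big| \le c_i ,
\qquad i = 1,\dots,m .

% Conclusion for independent X_1,\dots,X_m and every t > 0:
\Pr\!\big[ f(X_1,\dots,X_m) - \mathbb{E} f(X_1,\dots,X_m) \ge t \big]
\le \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^{m} c_i^2} \right).
```

The bound on the change caused by altering a single element of the mapping, derived above, plays the role of the constants $c_i$.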
Appendix C Proof of Theorem 27
In Appendix C.1 we show that provided the conditions of Theorem 27, -Nash equilibrium is achievable by autonomous strategies (achievability proof), and in Appendix C.2, we show that if the autonomous strategy profile forms an -Nash equilibrium, then, and satisfy the conditions of Theorem 27 (converse proof).
Appendix C.1 Achievability proof
Let , and be the random variables in the statement of Theorem 27 for which the entropy constraints hold strictly, i.e., , and . Take of the form , and divide the total stages into blocks of stages. We generate the action sequence of each block (except for the first block) as a function of the random source observed during the previous block. In all stages of the first block, fixed actions and are played by Alice and Bob, respectively. Furthermore, excluding the first block, each block is further divided into four subblocks each of which include the following set of stages:
[TABLE]
We generate strategies of Alice and Bob in such a way that they use the randomness source observed in last block to generate the actions of current block. We would like the generated actions of Alice and Bob to be almost i.i.d. according to respective distributions and during each subblock for all . Moreover, the action played in each stage should be also independent of the other player’s observations up to that stage. As and , intuitively, each player could generate his/her actions with the intended pmf as a function of his/her corresponding randomness source observed during the previous block. Then, and would imply that in limit, the constructed strategies form an -Nash equilibrium. We will now present the above sketch of proof more precisely.
For arbitrary , let and denote the actions of Alice and Bob, respectively, and let and be the randomness sources observed by Alice and Bob, respectively, during block number . Let us construct the strategies for Alice and for Bob as follows: Alice and Bob choose their actions in each block as a deterministic function of the corresponding sequence of random sources observed during the previous block, i.e., block number . Particularly, there exist a sequence of deterministic mappings and such that for all we have , and . In all stages of the first block, both Alice and Bob choose an arbitrary fixed action. Next, we fulfill the specification of the strategies and by specifying the functions and .
Let and be ideal distributions defined as follows:
[TABLE]
For arbitrary , let be the mapping of Lemma 14 that simulates from , and let be the mapping of Lemma 14 that simulates from . Thus, for all and , we have
[TABLE]
Note that , and
[TABLE]
where the first inequality follows from and ; the second inequality holds for sufficiently large . Therefore, Equation (C.3) implies
[TABLE]
Note that ; thus, there exists a such that . Therefore, Equation (C.5) implies that for arbitrary , one can choose large enough so that
[TABLE]
A similar argument concludes that for sufficiently large we have
[TABLE]
Note that is independent of , thus, is independent of , i.e., . This fact along with Equations (C.6), (C.7) and (C.2) implies
[TABLE]
where we utilized the third property of total variation in Lemma 1. Note that is the actual distribution of actions in block number , while is the ideal one. Next, for arbitrary , we have
[TABLE]
where the first inequality follows from the following two facts:
1. The expected payoff of the first block is bounded below by , where .
2. Inequality (C.8) implies that excluding the first block, the expected payoff of each block is in at most distance of the expected payoff induced by the ideal distribution of actions.
The second inequality holds for sufficiently large and , because tends to , and tends to zero as and tend to infinity.
Next, we prove that the strategies and constructed above form the desired approximate Nash equilibrium. Consider an arbitrary strategy (not necessarily an autonomous strategy) for Alice, and let Alice and Bob play according to strategy profile . In this case, let and denote the sequence of actions of Alice and Bob in block number ; moreover, let and denote the sequence of randomness sources observed during block number . Observe that is an autonomous strategy; thus, changing the strategy of Alice from to has no impact on the actions of Bob; hence, the sequence of actions of Bob still satisfies the property of Equation (C.7), i.e.,
[TABLE]
Note that at -th stage of block number , Alice finds information about just through ; thus, is independent of given . On the other hand, is a deterministic function of . Therefore, is also independent of given . Hence,
[TABLE]
Equations (C.10) and (C.11) along with the first property of total variation in Lemma 1 imply that for all , we have
[TABLE]
Note that the ideal distribution guarantees that the expected payoff of block number is no more than
[TABLE]
On the other hand, Equation (C.12) implies that the actual expected payoff of arbitrary block number is in distance of the ideal one. Thus, the actual expected payoff of block number is no more than
[TABLE]
Furthermore, the expected payoff of the first block is bounded above by ; thus,
[TABLE]
where the second inequality holds for sufficiently large and .
Equations (C.9) and (C.13) imply
[TABLE]
where the second inequality follows from the assumption in the statement of the theorem that . By a similar argument as above, we can show that for arbitrary strategy for Bob we have
[TABLE]
Inequalities (C.14) and (C.15) imply that the strategy profile forms a -Nash equilibrium. But one can choose and small enough to make as small as desired; thus, -Nash equilibrium is achievable.
Appendix C.2 Converse proof
Let be an autonomous strategy profile generating an -Nash equilibrium in the stage repeated game, and let and be the sequence of actions of Alice and Bob. Let be a random variable chosen from uniformly and independent of . Let us define
[TABLE]
We show that the random variables along with and satisfy the conditions of Theorem 27.
The Markov conditions: In this part, our goal is to show that is independent of given , i.e., . To do this, it suffices to prove that for all , is independent of given . Note that the strategies and are autonomous, hence, is a deterministic function of , and is a deterministic function of ; thus, the fact that is independent of implies that is independent of . Therefore, is independent of given .
Entropy conditions: We show that :
[TABLE]
where (C.16) follows from the independence of from . Recall that is independent of , thus, –a deterministic function of – is independent of –a deterministic function of . Hence, Equation (C.17) is correct. Equation (C.18) is implied by the fact that is a deterministic function of . A similar argument justifies .
Equilibrium conditions: In this part we show that . To do this, we consider a new game in which Alice and Bob play according to strategy profile , where will now be constructed. In the new game, let and denote the respective actions of Alice and Bob; furthermore, let and denote the respective sources of randomness of Alice and Bob, and denote the history of observations of Alice until stage . Note that is the same strategy as considered in the beginning of the converse proof, whereas is generated as follows: given , an arbitrary history of observations of Alice until stage , is the best choice of Alice that maximizes the expected payoff at stage , i.e.,
[TABLE]
where denotes the expectation with respect to the probability distribution induced by and . We have
[TABLE]
where Equation (C.19) results from the following two facts:
1. is a deterministic function of , thus, is independent of given .
2. is an autonomous strategy, thus, is a deterministic function of . On the other hand, is independent of . Therefore, is independent of .
Note again that is an autonomous strategy, thus, the probability distribution of the actions of Bob is not related to the strategy of Alice; hence, (recall that is the sequence of actions of Bob in the original game in which strategy profile is played); as a result, Equation (C.20) holds. Since and are autonomous strategies, is a deterministic function of , and is a deterministic function of . This fact along with the independence of from implies that is independent of , thus Equation (C.21) follows.
Furthermore, the expected average payoff induced by equals
[TABLE]
Equations (C.22) and (C.23) imply
[TABLE]
The above equation along with the fact that is an -Nash equilibrium implies that . Using a similar argument as above, one can show that .
Cardinality bound: The identified random variables satisfy the constraints of the theorem, except the cardinality bound on . Cardinality of can be reduced using standard arguments such as the support lemma of (El Gamal and Kim, 2011, Appendix C), or the Fenchel-Bunt extension of Carathéodory’s theorem. We modify and generate a new distribution so that it also satisfies the cardinality bound. Let ; this guarantees that under , given , is independent of . Next, we complete the definition of by specifying the marginal distribution . We can perceive as a real vector satisfying the following linear constraints:
[TABLE]
where is the sample space of the random variable .
Let denote the entropy of given , under the distribution . Similarly, the prime superscript in , and indicates that they are computed according to probability distribution . We drop the superscript when the evaluation is done under the original probability distribution . Assume that also satisfies the following linear constraints:
[TABLE]
Let denote the polytope of marginal distributions satisfying (C.24)-(C.28). Note that is not empty since it contains . We choose to be an element of that minimizes . This guarantees that inherits the following properties from :
[TABLE]
We will now show that also satisfies the cardinality bound on the support of . Note that is linear in , hence, its minimum occurs at a vertex of the polytope . The polytope lies in a dimensional space, and hence, each of its vertices lies in at least out of the hyperplanes defining (Equations (C.24)-(C.28)). Therefore, , which is a vertex of , lies in at least out of hyperplanes of the form (C.24). Hence, has at most 4 non-zero elements.
References
- Beck, C., Schlögl, F., 1995. Thermodynamics of chaotic systems: an introduction. No. 4. Cambridge University Press.
- Bernardini, R., Rinaldo, R., 2018. Generalized Elias schemes for efficient harvesting of truly random bits. International Journal of Information Security 17 (1), 67–81.
- Budinich, M., Fortnow, L., 2011. Repeated matching pennies with limited randomness. In: Proceedings of the 12th ACM Conference on Electronic Commerce. ACM, pp. 111–118.
- El Gamal, A., Kim, Y.-H., 2011. Network information theory. Cambridge University Press.
- Elias, P., 1972. The efficient construction of an unbiased random sequence. The Annals of Mathematical Statistics, 865–870.
- Gossner, O., Vieille, N., 2002. How to play with a biased coin? Games and Economic Behavior 41 (2), 206–226.
- Han, T. S., 2003. Information-spectrum methods in information theory. Vol. 50. Springer-Verlag Berlin Heidelberg.
- Hayashi, M., 2011. Exponential decreasing rate of leaked information in universal random privacy amplification. IEEE Transactions on Information Theory 57 (6), 3989–4001.
