Concerning an adversarial version of the Last-Success-Problem
José María Grau Ribas

TL;DR
This paper analyzes an adversarial variation of the Last-Success-Problem involving two players who sequentially observe Bernoulli variables and decide whether to continue or pass, determining optimal strategies and winning probabilities.
Contribution
It introduces an adversarial version of the Last-Success-Problem, deriving optimal strategies and characterizing winning probabilities under different parameter settings.
Findings
Optimal strategies for both players are established.
Probability of Player A winning decreases with increasing n when all n parameters are equal to 1/n.
As n grows, Player A's winning probability approaches 1/2 − 1/(2e²) ≈ 0.4323.
Concerning an adversarial version of the Last-Success-Problem
J.M. Grau Ribas
Departamento de Matemáticas, Universidad de Oviedo
Avda. Calvo Sotelo s/n, 33007 Oviedo, Spain
Abstract.
There are $n$ independent Bernoulli random variables with parameters $p_i$ that are observed sequentially. Two players, A and B, act in turns starting with player A. Each player has the possibility on his turn, when $I_k = 1$, to choose whether to continue with his turn or to pass his turn on to his opponent for observation of the variable $I_{k+1}$. If $I_k = 0$, the player must continue with his turn. After observing the last variable, the player whose turn it is wins if $I_n = 1$, and loses otherwise. We determine the optimal strategy for the player whose turn it is and establish the necessary and sufficient condition for player A to have a greater probability of winning than player B. We find that, in the case of $n$ Bernoulli random variables with parameters $1/n$, the probability of player A winning is decreasing with $n$ towards its limit $\frac{1}{2} - \frac{1}{2e^{2}} \approx 0.4323$. We also study the game when the parameters are the results of uniform random variables, $U[0,1]$.
Key words and phrases: Last-Success-Problem; Odds-Theorem; Optimal stopping; Optimal threshold
AMS 2010 Mathematics Subject Classification 60G40, 62L15
1. Introduction
The Last-Success-Problem (LSP) is the problem of maximizing the probability of stopping on the last success in a finite sequence of Bernoulli trials. There are $n$ Bernoulli random variables which are observed sequentially. The problem is to find a stopping rule that maximizes the probability of stopping at the last "1". This problem has been studied by Hill and Krengel [4] and Hsiau and Yang [5] for the case in which the random variables are independent, and was simply and elegantly solved by T.F. Bruss in [1] with the following famous result.
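Theorem 1.
(Odds-Theorem, T.F. Bruss 2000). Let $I_1, I_2, \dots, I_n$ be $n$ independent Bernoulli random variables with known $n$. We denote by $p_i$ ($i = 1, \dots, n$) the parameter of $I_i$; i.e. $p_i = P(I_i = 1)$. Let $q_i = 1 - p_i$ and $r_i = p_i / q_i$. We define the index
$$ s = \begin{cases} \max\left\{ 1 \le k \le n : \sum_{j=k}^{n} r_j \ge 1 \right\}, & \text{if } \sum_{i=1}^{n} r_i \ge 1, \\ 1, & \text{otherwise.} \end{cases} $$
To maximize the probability of stopping on the last "1" in the sequence, it is optimal to stop on the first "1" that we encounter among the variables $I_s, I_{s+1}, \dots, I_n$.
The optimal win probability is given by
$$ V(p_1, \dots, p_n) = \left( \prod_{j=s}^{n} q_j \right) \left( \sum_{i=s}^{n} r_i \right). $$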
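As an illustration of the Odds-Theorem, the following minimal sketch sums the odds backwards from the last variable until the sum reaches 1 and then evaluates the product-times-sum expression for the win probability. The function name `odds_theorem` and the restriction to parameters strictly between 0 and 1 (so that all odds are finite) are assumptions made for this example, not part of the original statement.

```python
from math import prod


def odds_theorem(p):
    """Return (s, win_probability) for success probabilities p[0..n-1].

    The index s is reported 1-based, as in the theorem. The sketch assumes
    0 < p_i < 1 so that every odds ratio r_i = p_i / (1 - p_i) is finite.
    """
    n = len(p)
    q = [1.0 - pi for pi in p]
    r = [pi / qi for pi, qi in zip(p, q)]

    # Sum the odds backwards from the last variable until the sum reaches 1.
    total = 0.0
    s = 1  # default index when the odds never sum to 1
    for k in range(n, 0, -1):
        total += r[k - 1]
        if total >= 1.0:
            s = k
            break

    # Win probability: (product of q_j) * (sum of r_i), both taken from s to n.
    win = prod(q[s - 1:]) * sum(r[s - 1:])
    return s, win
```

For instance, with three variables of parameter 0.1 the odds never sum to 1, so the index is 1 and the win probability is the probability of exactly one success.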
We propose the following adversarial version of the problem in this paper. There are $n$ independent Bernoulli random variables $I_1, \dots, I_n$ with parameters $p_1, \dots, p_n$ that are observed sequentially. Two players, A and B, act in turns starting with A. After observing the value of $I_k$, if $I_k = 1$, then the player whose turn it is may pass his turn to his opponent or use it and observe the variable $I_{k+1}$. When the last event is reached, if the result is success ($I_n = 1$), the player whose turn it is wins, and loses otherwise. Specifically, if $I_i = 0$ for all $i$, player A loses. This is reminiscent of the hot potato game in which the goal is not to be holding the hot potato at the end of the game, with the rule of being able to pass it on (if one so wishes) to one's opponent when $I_k = 1$.
Let us denote by $\mathcal{P}_k$ the probability of the player whose turn it is winning when we are about to observe the variable $I_k$. In particular, the probability of player A winning is $\mathcal{P}_1$; hence the probability of player B winning is $1 - \mathcal{P}_1$. Likewise, on observing the last random variable, the player whose turn it is will win with probability $p_n$, i.e. $\mathcal{P}_n = p_n$.
The dynamic program to find the optimal strategy is straightforward. After observing the variable $I_k$, if $I_k = 0$, which occurs with probability $1 - p_k$, the player then irrevocably goes on to observe the variable $I_{k+1}$ without giving up his turn. If $I_k = 1$, the optimal strategy of the player whose turn it is will consist in passing his turn to his opponent if $\mathcal{P}_{k+1} < 1/2$ and in continuing with his turn if $\mathcal{P}_{k+1} \ge 1/2$. We shall then have the following recurrence.
$$ \mathcal{P}_n = p_n, \qquad \mathcal{P}_k = (1 - p_k)\,\mathcal{P}_{k+1} + p_k \max\{\mathcal{P}_{k+1},\, 1 - \mathcal{P}_{k+1}\}, \quad k = n-1, \dots, 1. $$
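The dynamic program described above can be sketched in a few lines. This is an illustrative reading of the recurrence, not the author's code: the function name `turn_win_probabilities` is hypothetical, and the pass rule is assumed to be "pass after a 1 exactly when the win probability at the next stage is below 1/2", which is equivalent to taking the larger of the two continuation values.

```python
def turn_win_probabilities(p):
    """Backward recursion for the win probability of the player to move.

    Returns a list v where v[k-1] is the probability that the player whose
    turn it is wins when the k-th variable is about to be observed
    (k = 1..n). v[0] is therefore player A's win probability.
    """
    n = len(p)
    v = [0.0] * n
    # At the last variable the player to move wins exactly when it is a 1.
    v[n - 1] = p[n - 1]
    for k in range(n - 2, -1, -1):
        nxt = v[k + 1]
        # After a 0 the player must keep the turn (value nxt); after a 1 he
        # keeps it or passes it, whichever is better: max(nxt, 1 - nxt).
        v[k] = (1.0 - p[k]) * nxt + p[k] * max(nxt, 1.0 - nxt)
    return v
```

With parameters (0.3, 0.8, 0.2), for example, the recursion gives 0.2 at the last stage and 0.68 at the two earlier stages, so passing is worthwhile only after the second variable.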
2. Optimal strategy
We shall see that the optimal strategy is extremely simple and that it is also very easy to determine which of the two players has the greater probability of winning. The exact calculation of this probability is another matter altogether, as it generally requires running the recurrence or performing calculations of equivalent cost.
Proposition 1.
If for all , , then for all the following is fulfilled:
[TABLE]
Proof.
It is evident that . We proceed by backward induction. We assume that the proposition is true for all and shall prove that it also holds for . From the induction hypothesis, , therefore , and hence
[TABLE]
On the other hand, considering that
[TABLE]
it turns out that
[TABLE]
∎
Proposition 2.
If and for all , , then:
[TABLE]
Proof.
We will take into account that
[TABLE]
For Proposition 3, , so
[TABLE]
[TABLE]
Now, since , it follows immediately from the dynamic program that
[TABLE]
∎
Proposition 3.
If and for all , , then:
[TABLE]
Proof.
For Proposition 3, , so
[TABLE]
Now, since , it follows immediately from the dynamic program that
[TABLE]
∎
The following result follows without difficulty from the previous propositions.
Proposition 4.
Let and considering
[TABLE]
The optimal strategy for the player whose turn it is when observing the variable is not to give up his turn before stage and to do so when he may starting from . In addition, the following is true.
* If , then .*
* If and , then .*
* If and , then .*
Let us denote by $\kappa$ the optimal threshold of the first player in his first turn (the index of the last Bernoulli event with parameter greater than or equal to $1/2$). The optimal strategy of the first player consists in continuing with his turn until reaching the $\kappa$-th event and thereafter giving up his turn whenever possible. Obviously, player B will do the same in his optimal game because, when his turn comes, he will be in the same situation as player A. In short, we have the following result.
Theorem 2.
The optimal strategy for both players is to give up their turn when (and only when) there are no random variables left to observe whose parameter is greater than or equal to $1/2$.
In fact, when played optimally by both players, the game can be seen as a game of solitaire played by player A assuming his opponent uses the optimal strategy. Thus, the probability of player A winning, the optimal threshold being $\kappa$, is the probability that the number of 1s resulting from the random variables from the $\kappa$-th onward is odd; that is to say:
$$ P\left( \sum_{i=\kappa}^{n} I_i \ \text{is odd} \right). $$
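The parity characterization above lends itself to a direct computation. The sketch below relies on the standard identity that, for independent Bernoulli variables, the probability of an odd number of successes is $\frac{1}{2}\left(1 - \prod_i (1 - 2p_i)\right)$. The function name `player_a_win_probability` is hypothetical, and the choice to start from the first variable when no parameter reaches 1/2 is an assumption drawn from the statement that both players then pass whenever possible.

```python
from math import prod


def player_a_win_probability(p):
    """Win probability of player A when both players use the threshold rule."""
    # 0-based index of the last parameter >= 1/2. If no parameter reaches
    # 1/2, fall back to the first variable (players pass whenever they can).
    kappa = max((i for i, pi in enumerate(p) if pi >= 0.5), default=0)
    # P(odd number of successes from kappa onward)
    #   = (1 - prod(1 - 2 * p_i)) / 2.
    return 0.5 * (1.0 - prod(1.0 - 2.0 * pi for pi in p[kappa:]))
```

For parameters (0.3, 0.8, 0.2) the threshold is the second variable and the value is 0.68, which agrees with running the backward recurrence on the same parameters.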
This allows establishing a (somewhat coarse) lower bound for the probability of player A winning. Bear in mind that the win probability in this game is greater than the probability of winning in the LSP.
Proposition 5.
If , then
[TABLE]
Proof.
It suffices to keep in mind that the probability of winning in the LSP under these conditions is greater than $1/e$ (see [2]). ∎
3. All the random variables have the same parameter
In this section, we study the particular case that all the Bernoulli random variables have the same parameter.
Proposition 6.
If for all , then the probability of player A winning is strictly increasing with always below its limit as tends to infinity
[TABLE]
Proof.
If $n$ is even, we have
[TABLE]
Similarly, if n is odd, we have
[TABLE]
∎
Proposition 7.
If we have $n$ Bernoulli random variables with $p_i = 1/n$, the probability of player A winning is decreasing in $n$ and is always greater than its limit as $n$ tends to infinity, namely
$$ \frac{1}{2} - \frac{1}{2e^{2}} = 0.4323\ldots $$
Proof.
If $n = 1$, then player A wins with probability $1$. If $n = 2$, then he wins with probability $1/2$. If $n > 2$, the optimal strategy for both players is to give up their turn whenever possible. Hence, player A will win if the number of 1s resulting from the random variables is odd.
If $n$ is even
[TABLE]
[TABLE]
Similarly, if $n$ is odd
[TABLE]
[TABLE]
[TABLE]
∎
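A quick numerical check of this behaviour is sketched below. It assumes, as in the proof, that player A wins exactly when the number of 1s is odd, and uses the standard parity identity, which for $n$ variables of parameter $1/n$ gives $\frac{1}{2}\left(1 - (1 - 2/n)^{n}\right)$. The names `win_probability_equal_odds`, `limit` and `values` are illustrative.

```python
from math import exp


def win_probability_equal_odds(n):
    """P(odd number of successes) for n independent Bernoulli(1/n) variables."""
    return 0.5 * (1.0 - (1.0 - 2.0 / n) ** n)


# Limit of the expression above as n tends to infinity: 1/2 - 1/(2 e^2).
limit = 0.5 - 0.5 * exp(-2.0)

# Values for n = 1, 2, ..., 2000, to inspect monotonicity and convergence.
values = [win_probability_equal_odds(n) for n in range(1, 2001)]
```

The first values are 1, 1/2 and 13/27, and the sequence keeps decreasing while staying above the limit.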
Proposition 8.
If we have $n$ Bernoulli random variables with $p_i \ge 1/n$, the probability of player A winning is greater than
$$ \frac{1}{2} - \frac{1}{2e^{2}}. $$
Proof.
If $p_i \ge 1/2$ for some $i$, then the probability of player A winning is at least $1/2$. Otherwise, think of the auxiliary game with all the parameters equal to $1/n$, in which the probability is greater than $\frac{1}{2} - \frac{1}{2e^{2}}$. It then suffices to consider successive modifications of this game, as in Lemma 1 (see below), each of which increases the win probability, until reaching the game considered. ∎
Proposition 9.
If we have Bernoulli random variables, of which there are with , the probability of player A winning is greater than
[TABLE]
Proof.
It can easily be seen that if for all , then the probability of player A winning is greater than the probability that he would have in the game resulting from excluding some random variable. Consequently, it suffices to observe that the value, , is exceeded in the auxiliary game resulting from excluding some random variables. In fact, if we have variables with for all of these variables, considering the game in which the other variables are excluded, then we are able to use the previous proposition. ∎
Lemma 1.
Let us consider the game with parameters and denote by the probability of a player winning when it is his turn after observing the variable in the resulting auxiliary game when changing to . Hence,
[TABLE]
[TABLE]
In other words, if we increase the value of the parameter of one of the Bernoulli random variables in a game, then the player’s probability of winning on his turn increases.
Proof.
Let us recall that and respectively denote the player’s probability of winning on his turn at stage in the original game and in the auxiliary game.
For all , it is evident that , as we are at a stage subsequent to the modified variable and the process "has no memory", so the modification has no effect.
For all , we proceed by backward induction. Let us first see that it is true for .
[TABLE]
[TABLE]
[TABLE]
We now assume that the statement holds for and shall prove that it holds for
[TABLE]
[TABLE]
[TABLE]
∎
3.1. A variant: if there have been no 1s, the game is repeated
We have seen that the game is advantageous for player A if and only if some parameter is greater than $1/2$. The reason that the game is disadvantageous for player A is related to the fact that he can lose because the results of all the random variables are $0$. In fact, if any of the variables is worth $1$, then the probability of the player winning by giving up his turn is greater than $1/2$. The following result shows that, if the rule of repeating the game when there have been no 1s is introduced, then the game is very advantageous for player A.
Proposition 10.
If for all and considering the rule that the game is repeated in the case of for all , the probability of player A winning is
[TABLE]
Besides, we have that is increasing and
[TABLE]
Proof.
Obviously, the optimal strategy with this rule is the same as in the game in its original version. The difference lies only in the probability of winning, which is conditioned by . Thus, bearing in mind that
[TABLE]
we have that
[TABLE]
Moreover, is increasing and its limit is . ∎
4. Random parameters for the Bernoulli variables
We finally determine the probability of player A winning (mean probability) when the parameters are the results of uniform random variables, $U[0,1]$. That is to say, before the game is played, the parameters of the Bernoulli variables are drawn as independent realizations of a uniform random variable, and these parameters are revealed to the players.
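The mean probability described above can be estimated by simulation. The following is a rough Monte Carlo sketch, not the closed-form approach of this section: it draws the parameters uniformly on the unit interval, evaluates player A's win probability for each draw with the backward recurrence (pass after a 1 exactly when the next-stage win probability is below 1/2), and averages. The function names, the sample size and the seed are arbitrary choices made for illustration.

```python
import random


def player_a_win_probability(p):
    """Player A's win probability for known parameters p, via backward recursion."""
    v = p[-1]  # at the last variable the player to move wins iff it is a 1
    for pk in reversed(p[:-1]):
        # Keep the turn after a 0; after a 1, keep or pass, whichever is better.
        v = (1.0 - pk) * v + pk * max(v, 1.0 - v)
    return v


def mean_win_probability(n, samples=50_000, seed=0):
    """Monte Carlo estimate of the mean win probability of player A
    when the n parameters are independent uniform draws on [0, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        p = [rng.random() for _ in range(n)]
        total += player_a_win_probability(p)
    return total / samples
```

As a sanity check, for a single variable the mean is 1/2, and for two variables a direct integration of the recurrence gives 5/8; the estimates should land close to these values, up to sampling error.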
Lemma 2.
If the last variables have parameters , which are the results of the uniform random variables , then the win probability of the player whose turn it is at stage is
[TABLE]
In particular, if
[TABLE]
Proof.
We denote by the win probability of the player whose turn it is at stage . Bearing in mind that is the result of a uniform random variable,
[TABLE]
[TABLE]
and solving with gives
[TABLE]
∎
Lemma 3.
If we have end variables with parameters that are the result of uniform random variables, , and is the result of a uniform random variable, , then the probability of player A winning is
[TABLE]
Proof.
We denote by the win probability of the player whose turn it is after . By reasoning similar to the above,
[TABLE]
[TABLE]
[TABLE]
∎
Lemma 4.
If all the parameters are the result of uniform random variables, , then player A’s probability of winning is
[TABLE]
Proposition 11.
If we have a game with variables whose parameters are the result of the uniform random variables, , player A’s probability of winning is
[TABLE]
Proof.
The probability that the last parameter greater than or equal to $1/2$ is the $k$-th is $1/2^{\,n-k+1}$, and the probability that all the parameters are less than $1/2$ is $1/2^{\,n}$.
[TABLE]
∎
5. Conclusions and future challenges
The proposed adversarial version of the Last-Success-Problem has a very simple optimal strategy that does not require any calculation. It only requires identifying the last variable whose parameter is greater than $1/2$ and, from that point on, always giving up one's turn to one's opponent. It seems interesting to pose the problem with non-independent random variables. The Last-Success-Problem with dependent Bernoulli random variables was addressed by Tamaki in [8], who considered that $I_1, \dots, I_n$ constitute a Markov chain with transition probabilities
[TABLE]
[TABLE]
and established an optimal stopping rule with a Markov version of the odds-theorem.
We predict that the adversarial version with dependent variables will also be simple and that the optimal strategy will most likely consist in adopting, at each $k$-th stage, the strategy that would be optimal if the remaining variables were independent with parameters (computable by recurrence)
[TABLE]
In short, we conjecture the following.
Conjecture 1.
Let be dependent Bernoulli random variables. Let . Then, the optimal strategy for the player whose turn it is after observing the variable is to give up his turn to his opponent if and only if for all .
It may also be interesting to pose the game with more than 2 players, in which case different types of payment could be considered. For any version, it is normal to consider the loser to be the player whose turn it is after the last Bernoulli trial, but several types of payment may be considered for the other players. If we consider that the players who do not lose each receive the same payment, we have the simplest version. In this respect, we conclude by posing the challenge to determine the limit with players, when tends to infinity, of the loss probability of each player, considering independent Bernoulli random variables with parameters . In fact, for 3 players, it is no longer a trivial problem, as only the limit for the probability of the first player losing is exactly calculable in a relatively straightforward way. Using the Mathematica symbolic calculation package, we obtained the following limit for the probability of the first player losing:
[TABLE]
However, it is no longer viable to find the exact limit of the probability of the other two players losing via this path. Computing for large values of allows an approximation, but only that. Specifically, we have that:
[TABLE]
In all the above calculations, we have assumed that the optimal strategy for both players is to give up their turn whenever possible. Of course, this will undoubtedly be true in this case. In general, however, there will be an optimal strategy that does not always consist in passing one’s turn to the following player.
References
- [1] Bruss, F.T. (2000). Sum the odds to one and stop. Ann. Probab. 28, no. 3, 1384–1391.
- [2] Bruss, F.T. (2003). A note on bounds for the odds theorem of optimal stopping. Ann. Probab. 31, no. 4, 1859–1861.
- [3] Dendievel, R. (2013). New developments of the odds-theorem. Mathematical Scientist 38, no. 2, 111–123.
- [4] Hill, T.P. and Krengel, U. (1992). A prophet inequality related to the secretary problem. Contemp. Math. 125, 209–215.
- [5] Hsiau, S.R. and Yang, J.R. (2000). A natural variation of the standard secretary problem. Statist. Sinica 10, 639–646.
- [6] Ano, K., Kakinuma, H. and Miyoshi, N. (2010). Odds theorem with multiple selection chances. Journal of Applied Probability 47, 1093–1104.
- [7] Tamaki, M. (2010). Sum the multiplicative odds to one and stop. Journal of Applied Probability 47, no. 3, 761–777.
- [8] Tamaki, M. (2006). Markov version of Bruss' odds-theorem (the development of information and decision processes). RIMS Kokyuroku 1504, 184–187.
