Terminal Ranking Games
Erhan Bayraktar, Yuchong Zhang

TL;DR
This paper studies a mean field game where agents are rewarded based on project rankings, using Schrödinger bridges to explicitly find equilibria and analyze the impact of reward structures on welfare and efficiency.
Contribution
It introduces a novel application of Schrödinger bridges to explicitly compute equilibria in ranking-based mean field games and addresses mechanism design and welfare analysis.
Findings
Explicit equilibrium calculations using Schrödinger bridges.
Identification of reward functions for desired equilibria.
Analysis of reward inequality effects on welfare and efficiency.
Abstract
We analyze a mean field tournament: a mean field game in which the agents receive rewards according to the ranking of the terminal value of their projects and are subject to cost of effort. Using Schrödinger bridges we are able to explicitly calculate the equilibrium. This allows us to identify the reward functions which would yield a desired equilibrium and solve several related mechanism design problems. We are also able to identify the effect of reward inequality on the players' welfare as well as calculate the price of anarchy.
Erhan Bayraktar is supported in part by the NSF under grant DMS-1613170 and by the Susan M. Smith Professorship. We are grateful to Jakša Cvitanić for stimulating discussions.
Erhan Bayraktar Department of Mathematics, University of Michigan, 530 Church Street, Ann Arbor, MI 48104, USA, [email protected]
Yuchong Zhang Department of Statistical Sciences, University of Toronto, 100 St. George Street, Toronto, Ontario M5S 3G3, Canada, [email protected]
Keywords: Tournaments, rank-based rewards, mechanism design, mean field games, price of anarchy, Schrödinger bridges, Lorenz order.
2020 Mathematics Subject Classification: 91A16, 91B43, 93E20
1 Introduction
Consider the following tournament: each player (indexed by ) exerts an effort, which we denote by , to move the value of her project/state, which is modeled as a drifted Brownian motion:
[TABLE]
We assume are independent. The cost of effort per unit time is assumed to be quadratic in with coefficient . The game ends at time , when each player receives a reward that is a deterministic function of three components:
- Her terminal project value ;
- The ranking of relative to other players, measured by the fraction of players having equal or worse performance (so that the top performer has rank one and the bottom performer has rank );
- Statistics of the population performance, such as population mean or the -th order statistic of or both. This allows us to cover the case when the “reward pie” is not fixed, but grows with the total production or the -th best performance. For simplicity of the presentation, we only consider dependence via the population mean.
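The tournament set-up described above can be sketched numerically. The following is a minimal illustration, not the paper's construction: the names `n_players`, `effort`, `sigma`, `horizon`, `x0` and `n_steps` are hypothetical, the effort is taken to be a constant for simplicity, and the state is simulated with a plain Euler scheme. The rank is computed exactly as in the text, as the fraction of players with equal or worse terminal value.

```python
import numpy as np

def simulate_terminal_values(n_players, effort, sigma, horizon, x0, n_steps, rng):
    """Euler scheme for dX = effort*dt + sigma*dB, independent across players.

    All parameter names are illustrative assumptions, not the paper's notation.
    """
    dt = horizon / n_steps
    x = np.full(n_players, x0, dtype=float)
    for _ in range(n_steps):
        x += effort * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_players)
    return x

def ranks(terminal_values):
    """Fraction of players whose terminal value is equal or worse (<=)."""
    x = np.asarray(terminal_values, dtype=float)
    # Binary search in the sorted sample counts the entries <= x_i.
    return np.searchsorted(np.sort(x), x, side="right") / x.size

rng = np.random.default_rng(0)
x_T = simulate_terminal_values(n_players=1000, effort=0.0, sigma=1.0,
                               horizon=1.0, x0=0.0, n_steps=50, rng=rng)
r = ranks(x_T)
population_mean = x_T.mean()  # the population statistic the reward may depend on
```

With this convention the best performer has rank one and the worst performer has rank one over the number of players, matching the description in the list above.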
In this paper, we analyze the mean field game associated with the above N-player game, and explicitly characterize the equilibrium (see Section 3), improving on the results of Bayraktar and Zhang [2], which dealt only with the abstract existence and uniqueness of the mean-field equilibrium. Analysis of mean-field games is useful in solving N-player games when N is large, since it has been shown in Bayraktar and Zhang [2] that the mean-field equilibrium can be used to construct approximate Nash equilibria for the finite player games.
Our explicit characterization, which is rare in mean field games, allows us to solve tournament design problems. Specifically, we determine in Section 5 the reward functions that maximize the rank- performance, the net profit (for the tournament planner), and the total effort, respectively. Moreover, in Section 6, we also compute the so-called price of anarchy which measures the efficiency loss due to decentralization; see e.g. Lacker and Ramanan [13], Carmona et al. [5], and Cardaliaguet and Rainer [3].
Mean field games, introduced simultaneously by Lasry and Lions [14, 15, 16], and Huang, Caines and Malhamé [12, 11] (see also the two-volume book of Carmona and Delarue [4] for an extensive overview), analyze games with a large number of players which are weakly interacting through their empirical distribution. The main appeal of mean field games is the decentralized structure of their equilibria: agents compute their best response to a given population distribution, which is then determined by a fixed point problem. The best response calculation is a pure stochastic control problem. Instead of working with the Hamilton-Jacobi-Bellman equation, we perform the calculation using Schrödinger bridges, which can be seen as the stochastic analogue of quadratic optimal transport. (See Léonard [18] and Chen et al. [7] for an overview of Schrödinger bridges and their connection to optimal transport.) We first introduce an auxiliary terminal distribution for the state (to convert the problem to an optimal transport problem), and then optimize over all such terminal distributions. This approach allows us to reformulate the best response problem as a static calculus of variations problem, which we then explicitly solve. This leads us to the next stage, the fixed point equation, whose solutions can be explicitly determined through their quantiles.
The distinguishing feature of our mean field game, i.e., tournament, is the rank-based feature of the reward. In particular, each player is rewarded according to the ranking of the terminal value of their project relative to the population, subject to cost of effort. This makes the analysis of the problem more difficult since the mean field interaction is non-local in the measure and the rank function is not regular. This problem was suggested by Guéant et al. [10] as a model in oil production, and was analyzed using abstract tools in the weak formulation by Carmona and Lacker [6] and in the strong formulation by Bayraktar and Zhang [2]. In these works continuity with respect to the rank was assumed. Related tournament games where the players are ranked according to their completion times have been considered by Bayraktar et al. [1] for controlled Brownian motion dynamics and by Nutz and Zhang [21] for one-stage Poisson dynamics with controlled jump intensity. In the Appendix we construct an extension of Schrödinger bridges from space to time, which can then be applied to construct the equilibrium in Bayraktar et al. [1] as well.
In economics there is a substantial literature on tournaments, going back to Lazear and Rosen [17]; see Bayraktar et al. [1] for a review. Most of these works focus on finitely many players or static models. Using such a one-shot model, Fang et al. [8] analyze the discouraging effects of inequality. In our paper we observe a similar phenomenon, in that the more unequal the reward is (in the Lorenz order, see e.g. Marshall et al. [20]) the smaller the game value for each player. However, unlike in the work of Fang et al. [8], the same is not true for the effort in our set-up: the most fair distribution induces agents to put forth zero effort. Hence, one of the questions we investigate in the mechanism design section is which reward function maximizes the cumulative effort. We also analyze the case when agents have a social planner doing the optimization, which is used in computing the price of anarchy.
The rest of the paper is organized as follows: In Section 2 we consider the single player’s problem and find her best response using Schrödinger bridges. We then explicitly compute the mean field equilibrium in Section 3 and show the effect of reward inequality on the well-being of the players in Section 4. Section 5 is where we investigate the tournament design problems with respect to several criteria. In Section 6 we compute the price of anarchy. Finally, in Appendix A we show how one can adapt the Schrödinger bridge approach to the completion time ranking game of Bayraktar et al. [1].
2 A single player’s problem
Let us first describe the incentives of the player. We call a reward function if it is increasing in all of its arguments (throughout the paper, increasing and decreasing are understood in the weak sense), -valued if , and satisfies for all . Denote the set of reward functions by and the set of bounded reward functions by .
Given the distribution of the terminal project value of the population, we wish to find the best response to for a representative player. For any , write for the cumulative distribution function (c.d.f.) of and for . A representative player with cost parameter solves the following stochastic control problem:
[TABLE]
Here is admissible if it is progressively measurable and satisfies . Unlike Bayraktar and Zhang [2], we consider the weak formulation of the above problem, which has an interesting connection with optimal transport.
Let be the Wiener space and be the Wiener measure starting at at . Under , the canonical process is a Brownian motion, and thus represents the project value process under zero effort. Let be the law of under , and be the filtration generated by the process .
For any such that , the Girsanov theorem implies that we can find an adapted process such that
[TABLE]
for some -Brownian motion . Conversely, given any sufficiently integrable adapted process , we can define such that the above equation holds. This means that if we restrict ourselves to sufficiently integrable effort processes, we can identify with the set of laws of the controlled project value process . Moreover, let denote the relative entropy, with the convention that if is not absolutely continuous with respect to . We have
[TABLE]
Thus, we take the following as our definition of the single player’s control problem:
[TABLE]
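The identification of a quadratic effort cost with a relative entropy can be sanity-checked in the simplest case of a constant drift. The sketch below relies on two standard facts: by Girsanov, the path-space relative entropy of a constant-drift law with respect to the driftless law is the expected value of the squared drift integrated over time, divided by twice the squared volatility; and for a constant drift this coincides with the relative entropy of the Gaussian terminal marginals, because both laws share the same bridges. The symbols `a`, `sigma`, `horizon` and `x0` are hypothetical names, and the exact constant relating the paper's cost coefficient to the entropy is not reproduced here.

```python
import numpy as np

def path_entropy_constant_drift(a, sigma, horizon):
    """Girsanov: H(P|Q) = E_P[(1/(2*sigma^2)) * int_0^T a^2 dt] for constant a."""
    return a**2 * horizon / (2.0 * sigma**2)

def terminal_kl_numeric(a, sigma, horizon, x0=0.0, n=200001):
    """KL( N(x0 + a*T, sigma^2*T) || N(x0, sigma^2*T) ) by a Riemann sum."""
    s = sigma * np.sqrt(horizon)
    shift = a * horizon
    lo = x0 + min(0.0, shift) - 12.0 * s
    hi = x0 + max(0.0, shift) + 12.0 * s
    x = np.linspace(lo, hi, n)
    log_p = -0.5 * ((x - x0 - shift) / s) ** 2  # log-density up to a shared constant
    log_q = -0.5 * ((x - x0) / s) ** 2
    p = np.exp(log_p) / (s * np.sqrt(2.0 * np.pi))
    return np.sum(p * (log_p - log_q)) * (x[1] - x[0])

dynamic = path_entropy_constant_drift(a=0.7, sigma=1.3, horizon=2.0)
static = terminal_kl_numeric(a=0.7, sigma=1.3, horizon=2.0)
```

The two numbers agree up to discretization error, which is the constant-drift instance of the dynamic and static viewpoints giving the same entropy.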
Remark 2.1.
Here to keep notation simple, we define the filtration to be the one generated by the canonical process, but similar to Bayraktar et al. [1, Remark 2.1], all results remain valid if we take to be a larger filtration for which remains a Brownian motion.
2.1 Reduction via Schrödinger bridges
Let and as before; will serve as our reference measure. (In the standard Schrödinger bridge problem, the reference measure is usually taken to be the stationary Wiener measure , but the disintegration argument works for any non-zero, non-negative, -finite measure on .) For any , write for the time- marginal of , and for the joint distribution of at time [math] and . Given a source distribution and a target distribution , the Schrödinger bridge problem looks for an entropy-minimizing transport from to :
[TABLE]
It is known, by a simple disintegration, that the solution to the Schrödinger bridge problem is given by (see Föllmer [9] and also Léonard [18], Chen et al. [7])
[TABLE]
where is the law of a scaled Brownian bridge (scaled by ), and is the solution to the following static optimization, assuming it exists:
[TABLE]
In addition, it holds that . Since , the static problem (2.5) is trivial, giving a minimum entropy of .
Going back to our control problem (2.3), by splitting the optimization over to a maximization over its time- marginal plus a constrained entropy minimization, we can utilize the equivalence between (2.4) and (2.5) and obtain
[TABLE]
Since for any , the inequality is in fact an equality. Thus,
[TABLE]
Let be the standard normal probability density function (p.d.f.) and introduce
[TABLE]
We finally arrive at a constrained calculus of variations problem over the p.d.f. of :
[TABLE]
which can be easily solved by the method of Lagrange multipliers. (Alternatively, one can directly drop the integral constraint, by observing that any non-negative function can be normalized to have integral equal to one.) The solution is provided below without proof. Once we find the optimal marginal , we can recover by
Proposition 2.1.
Given and . Let
[TABLE]
where is defined in (2.7). Suppose . Then the optimal terminal distribution of the single player has p.d.f.
[TABLE]
The optimal value is given by
Remark 2.2.
The Schrödinger bridge approach can also be adapted to the hitting time ranking game of Bayraktar et al. [1]. This calls for a variant of the Schrödinger bridge problem where the target distribution is not the time- marginal, but the law of first passage time of level zero. We detail this digression in the appendix for the interested readers.
Remark 2.3.
The model assumption that has a Gaussian -density is not essential in the best-response step; what matters is that the cost of effort can be written as a relative entropy where and are the laws of the state process corresponding to zero effort and general effort, respectively. Suppose the -distribution of is . Using the equivalence between the dynamic and static Schrödinger bridge problems, equations (2.6) and (2.8) hold with replaced by and replaced by :
[TABLE]
Using the method of Lagrange multipliers, one finds that the optimal is given by
[TABLE]
which is similar to (2.9). The derivation of the fixed point (see the proof of Theorem 3.2), on the other hand, relies on the existence of a density, but not on the Gaussian property.
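The Lagrange multiplier step for a general reference density can be illustrated on a grid. The sketch below uses the standard Gibbs variational principle: maximizing the expected reward minus a multiple `lam` of the relative entropy with respect to a reference density yields an optimal density proportional to the reference density times `exp(reward / lam)`, with optimal value `lam * log(Z)`, where `Z` is the normalizing constant. The scale `lam` is a hypothetical stand-in for the paper's constant combining the cost coefficient and the volatility, and the Gaussian reference and `tanh` reward are illustrative choices only.

```python
import numpy as np

def gibbs_best_response(x, ref_density, reward, lam):
    """Maximise  int reward*f dx - lam*KL(f || ref)  over densities f on the grid x."""
    dx = x[1] - x[0]
    unnorm = ref_density * np.exp(reward / lam)
    z = np.sum(unnorm) * dx                # normalising constant
    return unnorm / z, lam * np.log(z)     # optimal density, optimal value

def objective(x, f, ref_density, reward, lam):
    """Expected reward minus lam times the relative entropy of f w.r.t. ref."""
    dx = x[1] - x[0]
    kl = np.sum(f * np.log(f / ref_density)) * dx
    return np.sum(reward * f) * dx - lam * kl

x = np.linspace(-10.0, 10.0, 4001)
ref = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)  # illustrative reference density
reward = np.tanh(x)                                # bounded, increasing reward
f_star, value = gibbs_best_response(x, ref, reward, lam=0.5)
value_check = objective(x, f_star, ref, reward, lam=0.5)
value_no_effort = objective(x, ref, ref, reward, lam=0.5)
```

Here `value_check` reproduces `value`, and staying at the reference density (zero effort) gives a strictly smaller objective. Nothing in the computation uses the Gaussian form of `ref`, in line with the remark above.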
2.2 Optimal effort
The Schrödinger bridge approach allows us to compute the optimal target distribution easily, which is all we need to analyze equilibrium measures (see Section 3 for details). On the other hand, to get a more explicit description of the optimal effort, ideally as a feedback function of time and state, we still need to go back to the dynamic control formulation of the Schrödinger bridge problem. We can utilize some existing results in, for example, Chen et al. [7].
Recall that under the reference measure , the canonical process is a scaled Brownian motion with transition density
[TABLE]
Note that . Define
[TABLE]
It can be easily checked that satisfy , ,
[TABLE]
By Chen et al. [7, p. 679-680], the optimal coupling has Markovian drift given by
[TABLE]
Using (2.9), we obtain
[TABLE]
Comparing with Bayraktar and Zhang [2, eq. (3.3)], we see that is precisely the Cole-Hopf transformation of the value function of the original control problem (2.1). Replacing by , we recover the same optimal Markovian control as Bayraktar and Zhang [2]:
[TABLE]
When is bounded, it is shown in Bayraktar and Zhang [2] that , meaning players show slackness when having a very big lead, and give up when falling far behind.
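The feedback form of the effort can be explored numerically for a reward that depends on the terminal state only. The sketch below assumes the standard Cole-Hopf (h-transform) structure: `h(t, x)` is the conditional expectation of `exp(reward(X_T) / k)` under the driftless dynamics, and the drift is `sigma**2` times the spatial derivative of `log h`. The scale `k` is hypothetical; if the running cost is `c * a**2` (an assumption about the normalization, since the displayed formulas are not reproduced here), then `k = 2 * c * sigma**2`. Gauss-Hermite quadrature replaces the Gaussian integral.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

NODES, WEIGHTS = hermegauss(80)          # quadrature for the weight exp(-z^2/2)
WEIGHTS = WEIGHTS / np.sqrt(2 * np.pi)   # now an expectation under N(0, 1)

def log_h(t, x, terminal_reward, sigma, horizon, k):
    """log E[ exp(terminal_reward(x + sigma*sqrt(T - t)*Z) / k) ],  Z ~ N(0, 1)."""
    y = x + sigma * np.sqrt(horizon - t) * NODES
    return np.log(np.sum(WEIGHTS * np.exp(terminal_reward(y) / k)))

def feedback_drift(t, x, terminal_reward, sigma, horizon, k, eps=1e-4):
    """Drift sigma^2 * d/dx log h(t, x), via a central finite difference."""
    up = log_h(t, x + eps, terminal_reward, sigma, horizon, k)
    down = log_h(t, x - eps, terminal_reward, sigma, horizon, k)
    return sigma**2 * (up - down) / (2 * eps)

# Bounded reward: effort is largest mid-field and fades far ahead or far behind.
mid = feedback_drift(0.3, 0.0, np.tanh, sigma=1.0, horizon=1.0, k=2.0)
ahead = feedback_drift(0.3, 8.0, np.tanh, sigma=1.0, horizon=1.0, k=2.0)
behind = feedback_drift(0.3, -8.0, np.tanh, sigma=1.0, horizon=1.0, k=2.0)

# Linear (unbounded) reward theta*x: the drift is the constant sigma^2*theta/k.
linear = feedback_drift(0.3, 1.7, lambda y: 0.6 * y, sigma=1.0, horizon=1.0, k=2.0)
```

For the bounded `tanh` reward the drift is visibly positive near the middle and essentially zero at both extremes, which is the slackness and giving-up behaviour described above. The linear case gives a state-independent drift, the textbook linear-quadratic answer.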
Remark 2.4.
For bounded rewards, Bayraktar and Zhang [2] also showed that the controlled diffusion in fact has a unique strong solution. From there, one can mimic the change of measure technique in Bayraktar et al. [1] to obtain the optimal terminal distribution (2.9). An advantage of the weak formulation, besides the connection to optimal transport theory, is that it avoids the hassle of having to verify the regularity of near .
3 Characterization of equilibrium
We say is an equilibrium (terminal distribution) if it is a fixed point of the best response mapping: , where is the optimal control for . By (2.9), we have the following characterization for general reward functions.
Theorem 3.1.
Let and satisfy
[TABLE]
(The above condition always holds when .) Then is an equilibrium if and only if it has a strictly positive density satisfying
[TABLE]
The associated game value is given by .
Specializing to the subclass of reward functions
[TABLE]
we obtain a semi-explicit characterization.
Theorem 3.2.
Suppose . Then there exists at least one equilibrium. is an equilibrium terminal distribution of the project value if and only if its quantile function satisfies
[TABLE]
where is the standard normal c.d.f. and is a solution of
[TABLE]
The associated game value is given by
[TABLE]
Proof.
Since is bounded, we only need to look for solutions of the fixed point equation (3.1). Let be the c.d.f. of the random variable , i.e.
[TABLE]
Since any fixed point has a positive density, we can differentiate and use (3.1) to get
[TABLE]
Using and , we find that
[TABLE]
and
[TABLE]
It follows that
[TABLE]
from which we get (3.2). To determine , we integrate (3.2) from to and use that . This leads to equation (3.3). It remains to show that (3.3) has a solution.
Let be the right hand side of (3.3). We want to show has a fixed point. Since is bounded, it can be shown that where
[TABLE]
It follows that
[TABLE]
So the range of is contained in a compact interval. Moreover, is continuous on this interval since is assumed to be continuous in . By Brouwer’s fixed point theorem, has a fixed point. ∎
Remark 3.1.
Observe that the equilibrium distribution does not change if we add any bounded function to the reward. In other words, any bounded compensation that is solely based on the mean performance of the population does not really incentivize the players.
Remark 3.2.
When is further independent of (i.e. purely rank-based), the equilibrium is unique. In this case, the total effort of the population (or the expected cumulative effort of a representative player) is given by
[TABLE]
Remark 3.3.
If we confine ourselves to the subclass of equilibria which satisfy , then all results in this section can be restated with which is obtained from by dropping the boundedness requirement.
In the next two sections, we focus on bounded rewards that are purely rank-based:
[TABLE]
Each of these rewards induces a unique equilibrium, which facilitates the study of comparative statics and optimal reward design. In this case, we write for the unique game value.
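For a purely rank-based bounded reward, the quantile characterization can be turned into a short computation. The sketch below is a reconstruction under stated assumptions, not the paper's formula verbatim: it assumes the fixed point has the Gibbs form f(x) = f0(x) exp(R(F(x)) / k) / beta, where f0 and F0 are the density and c.d.f. of the zero-effort terminal law (normal with mean `x0` and variance `sigma**2 * horizon`) and `k` is a hypothetical scale standing in for the paper's constant. Separating variables in F' = f gives F0(q(r)) = (integral of exp(-R / k) over [0, r]) / (integral over [0, 1]), and the game value k log(beta) equals -k times the logarithm of the integral over [0, 1].

```python
import numpy as np
from statistics import NormalDist

def rank_based_equilibrium(reward, k, x0, sigma, horizon, n=2001):
    """Return (ranks, quantiles, game_value) for a bounded rank-based reward.

    `reward` is a vectorised function of the rank r in [0, 1]; `k` is a
    hypothetical scale parameter (an assumption, see the lead-in).
    """
    r = np.linspace(0.0, 1.0, n)
    w = np.exp(-reward(r) / k)
    # Cumulative trapezoid integral of w over [0, r].
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * np.diff(r))))
    total = cum[-1]
    u = cum / total                 # zero-effort probability level matched to rank r
    inner = slice(1, -1)            # the endpoints map to -inf and +inf
    std = NormalDist()
    q = np.array([x0 + sigma * np.sqrt(horizon) * std.inv_cdf(p) for p in u[inner]])
    return r[inner], q, -k * np.log(total)

# Uniform reward: nobody has an incentive to work, so the law stays Gaussian.
r_u, q_u, v_u = rank_based_equilibrium(lambda r: np.ones_like(r), k=0.5,
                                       x0=0.0, sigma=1.0, horizon=1.0)
# Step reward with the same average prize: top half gets 2, bottom half gets 0.
r_s, q_s, v_s = rank_based_equilibrium(lambda r: 2.0 * (r >= 0.5), k=0.5,
                                       x0=0.0, sigma=1.0, horizon=1.0)
```

Under these assumptions the uniform reward returns the zero-effort quantiles and a game value equal to the prize, while the step reward shifts every quantile upward and lowers the game value.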
4 Effect of reward inequality
Definition 4.1.
Given two reward functions , we say is more unequal than in Lorenz order (or majorizes ), written as , if and
[TABLE]
Theorem 4.1.
Suppose and , then the associated game values satisfy ; that is, reward inequality decreases the game value.
Proof.
First assume , where is the set of piecewise constant reward functions of the form
[TABLE]
In this case, the Lorenz order translates to and for all ; that is, majorizes . By Marshall et al. [20, Proposition 4.B.1], if and only if for all continuous convex functions . Taking , we obtain
[TABLE]
which is equivalent to . This finishes the proof for piecewise constant reward functions.
For general , we approximate and by the Riemann sums and , respectively, where . Moreover, by the mean value theorem, can always be chosen to satisfy and for all . This ensures that the discretization preserves the Lorenz order. The result then follows from the previous step and passing to the limit. ∎
Remark 4.1.
The maximum game value is attained by the most equal reward function, namely, the uniform reward. This can also be directly seen from Jensen’s inequality:
[TABLE]
with equality attained if and only if is constant. From another perspective, the expected reward in equilibrium is always equal to by symmetry, while the expected cost of effort is minimized to zero under the uniform reward, when nobody exerts any effort. Since uniform reward induces zero effort, the expected total effort clearly does not have the same monotonicity as the game value with respect to reward inequality (cf. Section 5.4).
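The monotonicity of the game value in the Lorenz order can be checked on piecewise constant rewards. The sketch below assumes that, for prizes constant on equal-width rank bins, the game value takes the form -k log(mean of exp(-prize / k)), which is consistent with the convexity argument in the proof of Theorem 4.1 and with the Jensen bound in Remark 4.1. The scale `k` and the three prize vectors are hypothetical.

```python
import numpy as np

def game_value(prizes, k):
    """-k * log(mean_i exp(-prizes_i / k)) for prizes on equal-width rank bins."""
    prizes = np.asarray(prizes, dtype=float)
    return -k * np.log(np.mean(np.exp(-prizes / k)))

def is_more_unequal(p, q):
    """True if p majorizes q: equal totals, and the partial sums of p taken
    from the lowest prize upward never exceed those of q."""
    p, q = np.sort(p), np.sort(q)
    same_total = np.isclose(p.sum(), q.sum())
    return bool(same_total and np.all(np.cumsum(p) <= np.cumsum(q) + 1e-12))

equal = [1.0, 1.0, 1.0, 1.0]
moderate = [0.5, 0.75, 1.25, 1.5]
winner_takes_all = [0.0, 0.0, 0.0, 4.0]

values = [game_value(p, k=1.0) for p in (winner_takes_all, moderate, equal)]
```

The three vectors share the same budget and are totally ordered by inequality, and `values` comes out increasing, with the uniform prize attaining the budget per player.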
5 Tournament design
Denote the mapping from to the unique equilibrium by
[TABLE]
From (3.2), we see that is translation invariant, i.e. for any constant . Let be the set of probability measures on that have strictly positive density. For , define the normalized density
[TABLE]
5.1 Realizing a target equilibrium distribution
Suppose the principal has in mind a target distribution of the terminal project value in equilibrium. He wants to know whether that is feasible via a purely rank-based reward, and if yes, how should he design the reward to achieve it? The following theorem completely characterizes the set of feasible equilibria and the reward functions that induce them.
Theorem 5.1.
- (i) The set of equilibria attainable by a purely rank-based reward is given by
[TABLE]
- (ii) If , then
[TABLE]
- (iii) Suppose we impose additional reservation “utility” constraint and budget constraint , then the constant in (ii) is restricted to
[TABLE]
where
[TABLE]
In particular, such a exists if and only if
[TABLE]
Proof.
(i) From Theorem 3.1, we know that the normalized density of any equilibrium is increasing and log-bounded. Conversely, given any with such properties, it is easy to check that satisfies (3.2) with purely rank-based reward function :
[TABLE]
(ii) If is another function in that attains in equilibrium, then
[TABLE]
by (3.2). Differentiating both sides with respect to and setting , we obtain
[TABLE]
Since the left hand side is independent of , must be constant.
(iii) Let be a reward function realizing in equilibrium. By Theorem 3.2, the game value . Hence if and only if . We also have
[TABLE]
So if and only if . ∎
Theorem 5.1 allows us to convert many optimal reward design problems into problems about finding the optimal target equilibrium distribution. We give three solvable examples below.
5.2 Maximizing rank- performance
Fix a number , a reservation utility and a budget . We look for a reward function which meets both the reservation utility requirement and the budget constraint, and which maximizes the -quantile of . Define the set of feasible reward functions by
[TABLE]
The optimization problem reads
[TABLE]
Theorem 5.2.
The optimal quantile is uniquely attained (up to a.e. equivalence) by the step function
[TABLE]
where is the unique solution in to the equation
[TABLE]
Let and be given by (2.7). We have
[TABLE]
and
[TABLE]
Proof.
By Theorem 5.1, if and only if and
[TABLE]
So we can equivalently formulate the optimization problem as one having as the decision variable, as the objective function, and and (5.1) as the constraints.
Maximizing is equivalent to maximizing
[TABLE]
For any feasible equilibrium distribution , let which implies , , and . In particular, the mapping from to is one-to-one. Further rewrite the optimization problem as
[TABLE]
where is also constrained to be positive, decreasing, bounded and bounded away from zero, as translated from . Each feasible clearly induces a feasible . Conversely, for any feasible , define . Then for some constant by Theorem 5.1. Together with the constraints in (5.2), we find that and that is feasible. Thus, the mapping from feasible to feasible is in fact bijective, which implies that it suffices for us to solve problem (5.2). Any optimal induces an optimal which can be realized by the reward function
[TABLE]
Here we have added the constant to to ensure that . The rest of the proof is devoted to solving the equivalent problem (5.2).
We first show that the constant given in the theorem statement is well-defined. Let
[TABLE]
It can be shown that is strictly decreasing on and strictly increasing on , hence has a global minimum at with . Moreover, as or . Since , by the intermediate value theorem, the equation
[TABLE]
has at least one solution. When , is the unique solution. When , there are two solutions: one in and the other in . In both cases, is well-defined.
Next, we show that
[TABLE]
is the unique optimizer of problem (5.2). Since , it is clear that is decreasing, bounded and bounded away from zero. Straightforward calculation also shows that
[TABLE]
Therefore, satisfies all the feasibility constraints. Given any other feasible , we have, by repeated application of Jensen’s inequality, that
[TABLE]
That is
[TABLE]
We claim that . Suppose on the contrary that . Then since and is strictly increasing on , we must have , which is a contradiction. Thus, we have proved that is optimal. In fact, is the unique optimizer, since would imply all Jensen’s inequalities above are equalities. This holds if and only if is constant on and . We then use and to deduce that .
Finally, we argue that the optimal reward function induced by is also unique. Because of the bijection between and , we know that is the unique optimal equilibrium distribution. Note that . By Theorem 5.1,
[TABLE]
The remaining theorem statements follow from direct calculation. ∎
Remark 5.1.
One can also replace the reservation utility constraint by the hard constraint: . Similar to Bayraktar et al. [1, Theorem 6.2], the optimal reward function in this case is the equal reward with cutoff rank , i.e. .
5.3 Maximizing net profit
Suppose each terminal output generates a profit for the principal, where is a bounded increasing function. The goal is to find such that and the net profit
[TABLE]
is maximized.
Theorem 5.3.
The optimal net profit is given by
[TABLE]
and is uniquely attained by
[TABLE]
where is given by (2.7), and
[TABLE]
Proof.
By Theorem 5.1, it suffices for us to look for the optimal which can then be realized by for any . It is clear that the principal should pick to minimize the cost. Write . We then have
[TABLE]
The optimization problem over is given by
[TABLE]
To solve problem (5.3), we define
[TABLE]
For each fixed , the integrand above attains its pointwise maximum at
[TABLE]
Clearly, since is bounded and increasing, so is . We then find by
[TABLE]
giving
[TABLE]
The formulas for and then follow. ∎
5.4 Maximizing total effort
Let be given. We look for a purely rank-based reward function which maximizes the total effort
[TABLE]
subject to the reservation utility constraint and the budget constraint .
Theorem 5.4.
. When , the unique optimal reward is given by . When , as , an -optimal reward is
[TABLE]
where
[TABLE]
Proof.
By Theorem 5.1, it suffices for us to look for an optimal target distribution satisfying
[TABLE]
Such a , if lies in , can be realized by the reward function . We shall assume that we are in the nontrivial case ; otherwise the only attainable equilibrium is , which is induced by the uniform reward. We first relax the boundedness requirement of ; it turns out that the relaxed optimizer fails to be in . We then construct an approximate optimizer by truncation.
The relaxed optimization problem over reads
[TABLE]
Any candidate optimizer necessarily satisfies the Kuhn–Tucker conditions (see e.g. Luenberger [19])
[TABLE]
The above implies
[TABLE]
and
[TABLE]
where is determined by the complementary slackness
[TABLE]
giving . We then have
[TABLE]
In other words, . It is also clear that is increasing. Since the objective and the equality constraints are linear in , and the inequality constraint is convex in , it can also be shown that these conditions, together with the monotonicity of , are sufficient for optimality. The relaxed optimal value equals .
Since is unbounded, such a . Consider the truncated defined in the theorem statement. We have and for all . Moreover, let
[TABLE]
We can show that
[TABLE]
and
[TABLE]
as . It follows that
[TABLE]
∎
Remark 5.2.
It can be verified using Theorem 3.1 that is an equilibrium induced by the unbounded reward
[TABLE]
The optimal effort process associated with is constant: , by straightforward calculation using (2.10). This can also be seen by directly substituting
[TABLE]
into the control problem, yielding a linear-quadratic optimization:
[TABLE]
However, it is not clear whether is the unique equilibrium under .
6 Price of anarchy
For a fixed reward function , the price of anarchy (PoA) is defined as the ratio between the optimal centralized welfare and the worst equilibrium welfare/game value. By centralized, we mean that the principal can prescribe and enforce the effort, or equivalently, the law of the controlled process, for the agents. We only consider a symmetric effort prescription, i.e. same terminal law for all players. To avoid triviality, we consider that is not purely rank-based, otherwise the optimal centralized welfare is always equal to which is attained by prescribing zero effort for all.
The optimal centralized welfare is defined as
[TABLE]
This is a control problem of McKean-Vlasov type. Similar to the derivation of (2.6), we can reformulate the centralized problem as
[TABLE]
When is independent of individual performance , the inner optimization over is explicitly solvable. Specifically, letting , we have
[TABLE]
Here and in the sequel, we omit the underlying assumption that and . Using the Lagrange method, we find that the mean-constrained entropy minimization has optimal value , attained by the normal distribution . It follows that
[TABLE]
We see that if has sub-quadratic growth. As one would expect for a symmetric game, the centralized solution does not depend on the rank-order allocation of rewards.
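The mean-constrained entropy minimization used above admits a short verification by exponential tilting. The following worked derivation is a sketch in assumed notation: $\mu_0 = N(x_0, \sigma^2 T)$ denotes the zero-effort terminal law, $m$ the prescribed mean, and $H(\cdot \mid \cdot)$ the relative entropy; these symbols are chosen here for illustration and may differ from the paper's.

```latex
% Exponential tilt of the reference law \mu_0 = N(x_0, \sigma^2 T):
\frac{d\mu_\theta}{d\mu_0}(x) = e^{\theta x - \Lambda(\theta)}, \qquad
\Lambda(\theta) = \theta x_0 + \tfrac{1}{2}\theta^2 \sigma^2 T, \qquad
\mu_\theta = N(x_0 + \theta \sigma^2 T,\ \sigma^2 T).

% For any \mu \ll \mu_0 with mean m:
H(\mu \mid \mu_0)
  = H(\mu \mid \mu_\theta) + \int \log\frac{d\mu_\theta}{d\mu_0}\, d\mu
  = H(\mu \mid \mu_\theta) + \theta m - \Lambda(\theta)
  \ \ge\ \theta m - \Lambda(\theta).

% Choosing \theta^* = (m - x_0)/(\sigma^2 T) makes \mu_{\theta^*} have mean m, so
\inf_{\mu:\ \int x\, d\mu = m} H(\mu \mid \mu_0)
  = \theta^* m - \Lambda(\theta^*)
  = \frac{(m - x_0)^2}{2 \sigma^2 T},
\qquad \text{attained only at } \mu = N(m, \sigma^2 T).
```

The bound is tight because $H(\mu \mid \mu_{\theta^*}) = 0$ forces $\mu = \mu_{\theta^*}$, which is why the minimizer is a normal law with the prescribed mean and the reference variance.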
When is independent of rank , we have
[TABLE]
Again by the Lagrange method, we find that the inner maximization over has solution
[TABLE]
where is determined by
[TABLE]
Plugging in into the formula for , we get
[TABLE]
Example 6.1.
Suppose with . Note that takes the same form as the effort-maximizing reward in Remark 5.2 with replaced by and by . In this case, we have and by (6.1),
[TABLE]
To compute the equilibrium welfare, observe that given with . Let , then is an equilibrium for if and only if is optimal for , which means is also the unique equilibrium for the purely rank-based reward . This allows us to directly use Remark 5.2 to write down one equilibrium (not necessarily unique), whose mean satisfies
[TABLE]
The unique solution is given by
[TABLE]
with associated game value . It follows that in this case,
[TABLE]
When , PoA. When , PoA. When or , PoA. If we only consider equilibria that satisfy , then all inequalities become equalities.
Example 6.2.
Suppose where is bounded increasing, and . In this case, it can be shown that , , and
[TABLE]
By (6.2), we get
[TABLE]
Since is independent of rank and linear in , we have that for all . By Theorem 3.1, all equilibria are characterized by (3.1). We find that there is a unique equilibrium which is normal with mean and variance . The associated game value is
[TABLE]
It follows that, after rearranging the denominator,
[TABLE]
When , PoA=. When , the game is non-interactive, and PoA.
Appendix A Schrödinger bridges from space to time
Let be the canonical space and be the Wiener measure starting at at time zero. Also let be the filtration generated by the canonical process. Define with the convention that . Given a reference measure , a source distribution and a target distribution where , consider the following variant of the Schrödinger bridge problem:
[TABLE]
For any , define . We have the disintegration:
[TABLE]
Similar to the standard Schrödinger bridge problem, one can show that the optimal transport plan is given by
[TABLE]
where is the solution to
[TABLE]
assuming the infimum is attained.
Now, consider the hitting time ranking game of Bayraktar et al. [1], where each agent solves
[TABLE]
Here it is assumed that for all . Take and , and identify with the set of laws
[TABLE]
The condition on the Radon-Nikodym derivative means for all . Let be the law of the first passage time of level of a Brownian motion. We can rewrite the agent’s control problem in weak formulation as
[TABLE]
Note that for each , the associated optimal is always equivalent to and satisfies , where . It follows that and the inequality is in fact an equality. The resulting static problem can be further split into a constrained calculus of variation problem, followed by a static optimization:
[TABLE]
By elementary calculation, one deduces the same formula as Bayraktar et al. [1, Eq. (2.5)].
References
- [1] Erhan Bayraktar, Jakša Cvitanić, and Yuchong Zhang. Large tournament games. Ann. Appl. Probab., 29(6):3695–3744, 2019.
- [2] Erhan Bayraktar and Yuchong Zhang. A rank-based mean field game in the strong formulation. Electron. Commun. Probab., 21:Paper No. 72, 12, 2016.
- [3] Pierre Cardaliaguet and Catherine Rainer. On the (in)efficiency of MFG equilibria. SIAM J. Control Optim., 57(4):2292–2314, 2019.
- [4] René Carmona and François Delarue. Probabilistic Theory of Mean Field Games with Applications I-II. Springer, 2018.
- [5] René Carmona, Christy V. Graves, and Zongjun Tan. Price of anarchy for mean field games. ESAIM: Proc S, 65:349–383, 2019.
- [6] René Carmona and Daniel Lacker. A probabilistic weak formulation of mean field games and applications. Ann. Appl. Probab., 25(3):1189–1231, 2015.
- [7] Yongxin Chen, Tryphon T. Georgiou, and Michele Pavon. On the relation between optimal transport and Schrödinger bridges: a stochastic control viewpoint. J. Optim. Theory Appl., 169(2):671–691, 2016.
- [8] Dawei Fang, Thomas Noe, and Philipp Strack. Turning up the heat: The discouraging effect of competition in contests. Journal of Political Economy, 128(5):1940–1975, 2020.
