Stability of Dining Clubs in the Kolkata Paise Problem with and without Cheating
Akshat Harlalka, Andrew Belmonte, Christopher Griffin

TL;DR
This paper studies the stability of dining clubs in the Kolkata Paise Restaurant problem, analyzing how cooperation, taxation, and cheating influence agents' strategies and system dynamics through theoretical and numerical methods.
Contribution
It introduces dining clubs and an evolutionary game framework, analyzes stability with and without cheating, and explores bifurcations and dynamics of cheater populations.
Findings
Dining clubs are evolutionarily stable strategies.
Cheating introduces unstable fixed points and bifurcations.
Numerical simulations reveal complex dynamics with multiple clubs.
Akshat Harlalka
Department of Computer Science, Penn State University, University Park, PA 16802
Andrew Belmonte
Department of Mathematics & Huck Institute of Life Sciences, Penn State University, University Park, PA 16802
Christopher Griffin
Applied Research Laboratory, Penn State University, University Park, PA 16802
Abstract
We introduce the idea of a dining club to the Kolkata Paise Restaurant Problem. In this problem, agents choose (randomly) among restaurants, but if multiple agents choose the same restaurant, only one will eat. Agents in the dining club will coordinate their restaurant choice to avoid choice collision and increase their probability of eating. We model the problem of deciding whether to join the dining club as an evolutionary game and show that the strategy of joining the dining club is evolutionarily stable. We then introduce an optimized member tax to those individuals in the dining club, which is used to provide a safety net for those group members who don’t eat because of collision with a non-dining club member. When non-dining club members are allowed to cheat and share communal food within the dining club, we show that a new unstable fixed point emerges in the dynamics. A bifurcation analysis is performed in this case. To conclude our theoretical study, we then introduce evolutionary dynamics for the cheater population and study these dynamics. Numerical experiments illustrate the behaviour of the system with more than one dining club and show several potential areas for future research.
I Introduction
The Kolkata Paise Restaurant Problem (KPRP) was first introduced in 2007 [1] during work on the Kolkata Paise Hotel Problem. Since then, it has been studied extensively [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1, 14, 15, 16, 17] in the econophysics literature. In its simplest form, we assume agents will choose among restaurants. Choice is governed by a distribution determined by an implicit ranking of the restaurants. The ranking represents the payoff of eating at a given restaurant. If two or more agents select the same restaurant, then the restaurant randomly chooses which agent to serve.
A broad overview of KPRP can be found in [3, 7, 11]. When all restaurants are ranked equally (i.e., have payoff $1$) and agents choose a restaurant at random, the expected payoff to each agent is easily seen to approach $1 - 1/e$ as the number of agents (and restaurants) tends to infinity. Using stochastic strategies and resource utilization models, the mean payoff can be increased above this value [18]. Identifying strategies to improve on the uncoordinated outcome is a central problem in KPRP.
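The uncoordinated baseline can be checked with a short simulation. The sketch below assumes as many restaurants as agents, uniform random choice, and one meal served per occupied restaurant; the function name and parameter values are illustrative and not taken from the paper.

```python
import math
import random


def simulate_uncoordinated(n_agents, n_days, seed=0):
    """Estimate the per-agent probability of eating when n_agents each pick
    one of n_agents equally ranked restaurants uniformly at random
    (assumption: the number of restaurants equals the number of agents)."""
    rng = random.Random(seed)
    served = 0
    for _ in range(n_days):
        # Every occupied restaurant serves exactly one of its customers.
        occupied = {rng.randrange(n_agents) for _ in range(n_agents)}
        served += len(occupied)
    return served / (n_agents * n_days)


estimate = simulate_uncoordinated(n_agents=500, n_days=200)
limit = 1.0 - math.exp(-1.0)
print(f"simulated: {estimate:.4f}, 1 - 1/e: {limit:.4f}")
```

For a few hundred agents the estimate should already sit close to $1 - 1/e$, since the finite-size value $1 - (1 - 1/n)^n$ converges quickly.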
KPRP is an example of an anti-coordination game (such as Hawk-Dove) [19]. Other examples of this class of game are minority games [20, 21] and the El Farol bar problem [22, 23, 24, 25]. These types of games also emerge in models of channel sharing in communications systems [26, 27, 28].
Learning in KPRP is considered in [12, 18, 29], with both classical and quantum learning considered in [12]. Quantum versions of the problem are considered in [12, 15, 16], and its relevance to other areas of physical modelling is considered in [8, 17, 14, 10], with phase transitions considered recently in [2, 9]. Distributed and coordinated solutions to optimizing agent payoff are discussed in [4, 5, 6, 13].
In this paper, we use evolutionary game theory to study a group formation problem within the context of KPRP. We assume that some subset of the population of individuals forms a dining club. Individuals in the dining club coordinate their actions and will choose distinct restaurants from each other, thus increasing the odds that any individual within the dining club will eat. In this context, we show the following results:
1. When all restaurants are ranked equally, membership in the dining club is globally stable. That is, asymptotically all players join the dining club.
2. When the dining club taxes its members by collecting food for redistribution to those members who did not eat, there is an optimal tax rate that ensures all members are equally well-fed.
3. When non-club members can choose to deceptively share in the communal food (freeload) of the dining club, a new unstable fixed point emerges. The fixed point corresponding to a population where all members join the dining club remains stable, but is no longer globally stable. We characterize the basin of attraction in this case. This effectively introduces a public goods game into the KPRP.
4. We then use numerical analysis to study the case where two dining clubs are active. We numerically illustrate the existence of equilibrium surfaces where multiple dining clubs can exist simultaneously along with non-group members as a result of group taxation (food sharing), cheating (freeloading), and cheating detection.
The remainder of this paper is organized as follows: In Section II, we analyse an evolutionary model of KPRP with a dining club. We study resource distribution through taxation and cheating in Section III. Cheating is modelled in an evolutionary context in Section IV. KPRP with multiple dynamic clubs is studied numerically in Section V. Finally, in Section VI we present conclusions and future directions.
II Mathematical Analysis
We first study KPRP with a single dining club. Let be the size of the dining club and let be the size of the free population with total population given by . The probability that an individual in the dining club eats is given by
[TABLE]
while the probability that a free individual eats is given by
[TABLE]
If we assume and sum over , then we can rewrite in closed form as
[TABLE]
Likewise, can be written as
[TABLE]
If we compute the limit as , this yields the asymptotic probabilities
[TABLE]
and
[TABLE]
For the remainder of this section and the next, we assume an infinite population. While it was easier to work with for the previous computation, for further analysis it is simpler to express as a fraction of the total population. Let
[TABLE]
Substituting
[TABLE]
into Eqs. 1 and 2 yields the simplified forms,
[TABLE]
A simple plot shows that for all .
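The closed form and its infinite-population limit can be checked numerically for club members. The sketch below is one reading of the model and not the paper's own equations: it assumes n agents and n restaurants, g club members seated at distinct restaurants, n - g free agents choosing uniformly at random, and a restaurant serving one of its arrivals uniformly at random. The names `n`, `g`, and `alpha` (club fraction) are assumed notation.

```python
import math


def club_eats_sum(n, g):
    """Probability that a club member eats, as a sum over the number k of
    free agents who arrive at the same restaurant (k is Binomial(n - g, 1/n))."""
    f = n - g
    p = 1.0 / n
    return sum(math.comb(f, k) * p**k * (1.0 - p)**(f - k) / (k + 1)
               for k in range(f + 1))


def club_eats_closed(n, g):
    """Closed form of the same sum, using E[1/(K+1)] for a binomial K."""
    f = n - g
    return n / (f + 1) * (1.0 - (1.0 - 1.0 / n)**(f + 1))


def club_eats_limit(alpha):
    """Limit as n grows with the club fraction alpha = g / n held fixed."""
    beta = 1.0 - alpha
    if beta < 1e-12:
        return 1.0
    return -math.expm1(-beta) / beta


print(club_eats_sum(200, 50), club_eats_closed(200, 50), club_eats_limit(0.25))
```

Under these assumptions the sum and the closed form agree to rounding error, and both approach the limiting value as n grows.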
Let be a random variable denoting the meal size for an individual in the dining club, and let be a random variable denoting the meal size for a randomly chosen member of the population. Then the probability of eating is now easily seen as the expected meal size , with a meal size of $1$ corresponding to eating and a meal size of $0$ corresponding to not eating. Using this interpretation, and equating meal size with fitness, we assume the rate of growth of the dining club is given by
[TABLE]
From [30], it follows that the proportion must follow the replicator dynamic
[TABLE]
The population mean can be computed as
[TABLE]
and converted to an expression in using Eqs. 1, 2 and 3 as,
[TABLE]
Let
[TABLE]
be the growth rate of . Then and we see that . That is, Eq. 4 has two fixed points. From Fig. 1, we must have for . This is illustrated in Fig. 2 (left). It follows that is described by a non-logistic sigmoid, as shown in Fig. 2 (right).
We conclude that the decision to join the dining club is an evolutionarily stable strategy: the fixed point at which the whole population belongs to the dining club is globally asymptotically stable, while the fixed point at which no one belongs to the dining club is unstable.
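The qualitative behaviour described above can be reproduced with a forward-Euler integration of a replicator equation. This is a minimal sketch under assumed infinite-population probabilities: `p_club` and `p_free` are derived here from a Poisson limit with as many restaurants as agents, and may differ in form from the paper's Eqs. 1 and 2. The symbol `alpha` (club fraction), the step size, and the horizon are also assumptions.

```python
import math


def p_club(alpha):
    """Assumed probability that a club member eats (Poisson limit)."""
    beta = 1.0 - alpha
    if beta < 1e-8:
        return 1.0 - beta / 2.0          # series expansion near beta = 0
    return -math.expm1(-beta) / beta


def p_free(alpha):
    """Assumed probability that a free agent eats: with probability alpha the
    chosen restaurant already holds a club member, otherwise it does not."""
    beta = 1.0 - alpha
    if beta < 1e-4:
        contested = 0.5 - beta / 6.0     # series of (beta - 1 + e^-beta) / beta^2
    else:
        contested = (beta - 1.0 + math.exp(-beta)) / beta**2
    return alpha * contested + (1.0 - alpha) * p_club(alpha)


def simulate_club_share(alpha0, dt=0.01, steps=40000):
    """Forward-Euler integration of d(alpha)/dt = alpha * (p_club - mean),
    where mean = alpha * p_club + (1 - alpha) * p_free."""
    alpha = alpha0
    path = [alpha]
    for _ in range(steps):
        mean = alpha * p_club(alpha) + (1.0 - alpha) * p_free(alpha)
        alpha += dt * alpha * (p_club(alpha) - mean)
        path.append(alpha)
    return path


trajectory = simulate_club_share(0.05)
print(f"club share after integration: {trajectory[-1]:.4f}")
```

With these assumed probabilities the club share grows slowly at first and then saturates near one, while a population that starts with no club members stays there.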
III Social Safety Nets and Deceptive Free Loading
Suppose the dining club imposes a food tax on its members at the rate so that if a diner is successful in obtaining food, then he reserves of his meal to be shared with club members who choose a restaurant that is occupied by an independent individual. If we assume these resources are pooled and then shared equally, the expected meal size (normalized to the interval $[0, 1]$) available for a club member who cannot obtain food on his own is given by
[TABLE]
Note that sharing (for any value of ) does not affect the expected meal size obtained by a group member, since we have the expected meal size
[TABLE]
We can construct a tax-rate that depends on and ensures all participants in the dining club receive the same meal size. Setting and solving, we obtain:
[TABLE]
Thus, as increases, the tax decreases. As a result of Eq. 6, the right-hand side of Eq. 4 remains unchanged and the decision to join the dining club is still evolutionarily stable, even in the presence of sharing. That is, the fixed point at which the whole population belongs to the dining club is still globally asymptotically stable.
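The equal-meal tax can be worked out in a few lines under a simple pooling assumption. In the sketch below, $\kappa$ (tax rate), $p_g$ (probability that a club member eats), and $s(\kappa)$ (shared meal size) are assumed notation, and the derivation is one plausible reading of the pooling rule rather than a reproduction of the paper's Eqs. 5 to 7.

```latex
% Assumed symbols: \kappa is the tax rate, p_g the probability that a club
% member eats. A fraction p_g of members eat and each contributes \kappa;
% the pool is split equally among the fraction 1 - p_g who did not eat.
\begin{align*}
  s(\kappa) &= \frac{\kappa\, p_g}{1 - p_g}
     && \text{(shared meal for a member who did not eat)} \\
  p_g (1 - \kappa) + (1 - p_g)\, s(\kappa) &= p_g
     && \text{(expected meal size is unchanged by sharing)} \\
  1 - \kappa = s(\kappa) \;&\Longrightarrow\; \kappa^{*} = 1 - p_g
     && \text{(tax that equalizes meal sizes)}
\end{align*}
```

Under this reading every club member receives a meal of size $p_g$, and the equalizing tax falls as the probability of eating rises.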
Suppose a proportion of the independent population that does not eat can deceptively pose as club members, thereby sharing in the communally available food. In the presence of a food tax, the resulting decision to join the dining club now becomes a public goods problem. Then the expected meal size to anyone receiving shared food is given by
[TABLE]
where is defined in terms of in Eq. 3. Let be the random variable denoting the expected meal size for an independent member of the population. Then as a function of and ,
[TABLE]
It is possible but unwieldy to compute using the expected meal size with deception rate and group size . Plotting sample curves for shows that the growth rate now changes sign at some value ; see Fig. 3 (left).
As a consequence of this, the replicator equation for is given by
[TABLE]
These dynamics exhibit a new unstable equilibrium point, illustrating a bifurcation in parameter with numerically computed bifurcation diagram shown in Fig. 3 (right). An example solution flow (for various initial conditions) is shown in Fig. 4.
We can compute for . This is particularly interesting because we have essentially constructed a public goods problem in which joining the dining club enforces a taxation rate of on the members, who are then guaranteed (the public good of) a meal each day. The presence of freeloaders destabilizes the group formation process, but does not guarantee that a group cannot form. Since is monotonically increasing, it follows that if grows slowly enough so that at any time , then the dining club will grow to include the entire population. If , then the dining club collapses. We impose an evolutionary dynamic on the freeloaders in the next section to study this effect.
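The sign change in the growth rate can be located numerically by bisection. The sketch below rests on several assumptions that are not taken from the paper: the Poisson-limit eating probabilities, a tax equal to one minus the club eating probability, a pool shared equally among hungry club members and a fraction `rho` of hungry free agents, and a free-agent payoff averaged over cheaters and non-cheaters. All names are hypothetical.

```python
import math


def p_club(alpha):
    """Assumed probability that a club member eats (Poisson limit)."""
    beta = 1.0 - alpha
    return 1.0 - beta / 2.0 if beta < 1e-8 else -math.expm1(-beta) / beta


def p_free(alpha):
    """Assumed probability that a free agent eats on their own."""
    beta = 1.0 - alpha
    if beta < 1e-4:
        contested = 0.5 - beta / 6.0
    else:
        contested = (beta - 1.0 + math.exp(-beta)) / beta**2
    return alpha * contested + (1.0 - alpha) * p_club(alpha)


def growth(alpha, rho):
    """Club meal size minus free-agent meal size with freeloading rate rho."""
    pg, pf = p_club(alpha), p_free(alpha)
    tax = 1.0 - pg                                   # assumed equalizing tax
    claimants = alpha * (1.0 - pg) + rho * (1.0 - alpha) * (1.0 - pf)
    shared = tax * pg * alpha / claimants            # meal per claimant
    club_meal = pg * (1.0 - tax) + (1.0 - pg) * shared
    free_meal = pf + rho * (1.0 - pf) * shared
    return club_meal - free_meal


def unstable_point(rho, lo=1e-6, hi=1.0 - 1e-6, iters=60):
    """Bisection for the club share at which the growth rate changes sign."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if growth(mid, rho) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


root = unstable_point(rho=0.5)
print(f"growth rate changes sign near a club share of {root:.3f}")
```

In this sketch, club shares below the root shrink and shares above it grow, which is the qualitative signature of an unstable interior fixed point.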
IV Evolving Freeloaders
If we divide the population into three groups, dining club members (), non-dining club freeloaders () and non-dining club non-freeloaders (), we can construct an evolutionary dynamic for the freeloaders. Let be the proportion of the population that is not in the dining club and will freeload (cheating), and let be the proportion of the population that is not in the dining club and not freeloading (honest). Then the population of freeloaders is . The expected meal size to any agent accepting communal food is then
[TABLE]
Let be as before, and let be the random variable denoting the meal size for an individual in the freeloading group and be the random variable denoting meal size for an individual from the non-freeloading non-dining club group. It follows from Eqs. 8, 9 and 10 that
[TABLE]
Here, we have replaced with its definition in terms of and . Employing the same reasoning we used to obtain Eq. 4, we can construct replicator equations for proportions , and .
The population mean meal size is
[TABLE]
The dynamics of (the non-freeloading, non-dining club group) are extraneous, and we can focus on the two-dimensional system
[TABLE]
which does not depend on the value of .
Fig. 5 shows the dynamics of this evolutionary system. It is straightforward to compute that when , then for all values of . Thus, the dynamics freeze on the left boundary of the simplex
[TABLE]
There is a single hyperbolic saddle on the boundary of that can be numerically computed as . The two boundary equilibria and are both locally asymptotically stable. Thus, the long-run population behaviour is dependent on the initial conditions. We can numerically construct a curve of initial conditions showing this dichotomous behaviour. This is shown in Fig. 6 and as the red curve in Fig. 5.
As approaches corresponding to equilibrium point for , the curve stops because would need to lie outside the simplex to cause the dining club to collapse. It is interesting to note that the phase portrait illustrates trajectories in which both and are increasing up to a point, followed by either the collapse of the dining club (while continues to increase) or the collapse of the freeloading group, as all population members join the dining club (and continues to increase).
V Numerical Results on Multiple Dining Clubs
We now consider KPRP with two dining clubs. We model three groups of agents , and for free agents, dining club one and dining club two respectively. We estimate , and using Monte Carlo simulation. This Monte Carlo simulation is then embedded into a larger dynamic process for updating the groups.
In the Monte Carlo simulation, the free agent group acts normally, choosing a restaurant randomly. The members of the dining clubs also choose restaurants randomly, but with the constraint that no two agents in a dining club may choose the same restaurant. Since we are studying this system numerically, we introduce two kinds of taxation policies.
1. Policy I: We assume a given tax rate with no redistribution; i.e., the tax goes to maintain the dining club in some form.
2. Policy II: Agents within the dining club are taxed at a rate given by Eq. 7, and food is redistributed to club members who do not eat (and possibly freeloaders).
Agents in the free market will randomly choose a dining club to eat in if they do not get food on a given day with probability . That is, we assume . We also introduce a probability that cheaters will be caught. If a cheater gets caught, their food is not distributed and becomes waste.
In the dynamic model that follows, we refer to the process of simulating groups eating over several days by the function . The system dynamics of our simulation are then described by the following steps:
1:Input: , , .
2:while There is at least one agent in each group do
3: Compute .
4: Set .
5: Choose two agents and at random from .
6: Let (resp. ) be the group to which (resp. ) belongs.
7: Let (resp. ) be the probability that (resp. ) eats.
8: if then
9: Move to
10: else if then
11: Move to
12: end if
13: Remove and from .
14: if then
15: goto 5
16: else
17: goto 3
18: end if
19:end while
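The steps above can be sketched in Python as follows. Several details are assumptions because the symbols in the listing did not survive extraction: agents are paired at random, the agent whose group has the lower estimated eating probability moves to the other agent's group, the loop stops once a single group holds everyone (or after `max_rounds`), and taxation, freeloading, and detection are omitted. All names and parameter values are hypothetical.

```python
import random


def eat_probabilities(groups, n_days, rng):
    """Monte Carlo estimate of the eating probability of each group.
    groups[i] is 0 for a free agent and 1 or 2 for the two dining clubs."""
    n = len(groups)
    wins = {0: 0, 1: 0, 2: 0}
    for _ in range(n_days):
        choice = [None] * n
        for club in (1, 2):
            members = [i for i, g in enumerate(groups) if g == club]
            # Members of one club pick distinct restaurants among the n available.
            for i, r in zip(members, rng.sample(range(n), len(members))):
                choice[i] = r
        for i, g in enumerate(groups):
            if g == 0:
                choice[i] = rng.randrange(n)    # free agents choose at random
        arrivals = {}
        for i, r in enumerate(choice):
            arrivals.setdefault(r, []).append(i)
        for diners in arrivals.values():
            wins[groups[rng.choice(diners)]] += 1   # one diner is served
    sizes = {g: groups.count(g) for g in wins}
    return {g: wins[g] / (sizes[g] * n_days) if sizes[g] else 0.0 for g in wins}


def run_dynamics(groups, n_days=20, max_rounds=200, seed=0):
    """Repeat: estimate probabilities, then let randomly paired agents imitate
    the group with the higher estimate (assumed comparison rule)."""
    rng = random.Random(seed)
    groups = list(groups)
    for _ in range(max_rounds):
        if len(set(groups)) == 1:               # everyone is in one group
            break
        prob = eat_probabilities(groups, n_days, rng)
        pool = list(range(len(groups)))
        rng.shuffle(pool)
        while len(pool) >= 2:
            i, j = pool.pop(), pool.pop()
            if prob[groups[i]] > prob[groups[j]]:
                groups[j] = groups[i]
            elif prob[groups[j]] > prob[groups[i]]:
                groups[i] = groups[j]
    return groups


final = run_dynamics([0] * 10 + [1] * 10 + [2] * 10)
print({g: final.count(g) for g in (0, 1, 2)})
```

Probabilities are re-estimated only after every agent has been paired once, which mirrors the two `goto` targets in the listing.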
It is clear that the dynamics simulated by this model have three equilibria, corresponding to the cases in which all agents belong to the free group, to dining club one, or to dining club two.
V.1 Simulation Results
For each simulation, we divide 100 agents among the free-agent group and the two dining clubs. To construct an approximation of the basins of attraction for the three equilibrium populations, we ran 1000 replications of the simulation from every possible (discrete) starting condition on the three group sizes.
Tax Policy I:
We explore the effect of varying from to . To manage simulation time, we executed the while loop at most 10000 times. If all players had not joined a single community by then, we declared the run failed, suggesting slow convergence from this initial condition. Almost all experiments resulted in the formation of a dominant group (either free agents or dining clubs). This is illustrated in Fig. 7.
Let and be the proportion of the population in dining clubs one and two, respectively, and let be the free group proportion. Then the dynamics can be projected to the two-dimensional unit simplex embedded in with coordinates . When the simulation converges, we can determine the $\omega$-limit set of trajectories leaving (near) an initial condition . Fig. 7 shows that the size of the tax rate is correlated with the size of the basin of attraction for the free agent group. The dynamics roughly partition the simplex into three basins of attraction, with the basins of attraction for the two dining clubs exhibiting symmetry, as expected. On the boundaries of these regions, we expect that unstable coexistence of multiple groups is possible. This is qualitatively similar to the unstable fixed point identified in Fig. 4.
Tax Policy II:
In a second set of experiments, we let vary between [math] and and used Eq. 7 to set the tax policy. The cheating probability was fixed at . As before, we executed the while loop at most 10000 times. If all players had not joined a single community by then, we declared the run failed, suggesting slow convergence from this initial condition. Basins of attraction for various fixed points are shown in Fig. 8.
It is interesting to notice that there are a substantial number of failed cases between the clubs. This suggests an area of slow dynamics and possibly the existence of a slow manifold. Constructing a mathematical model of this scenario is an area reserved for future work, since it is unclear exactly how the dynamics are changing in this region.
VI Conclusions and Future Directions
In this paper, we studied the Kolkata Paise Restaurant Problem (KPRP) with dining clubs. Agents in a dining club mutually agree to visit separate restaurants, thereby increasing the probability that they eat (obtain a resource). An evolutionary game model was formulated describing the choice to join the dining club. We showed that joining the dining club is an evolutionarily stable strategy, even when members are taxed (in food) and resources are distributed. When cheating was introduced among the non-dining club members, i.e., when they could deceptively benefit from the communal food collected by the dining club, a new unstable fixed point appeared. We analysed this bifurcation as well as the decision to cheat using the resulting replicator dynamic. Numerical experiments on two dining clubs show that the behaviour in this case is similar to the case with one dining club, but may exhibit richer dynamics.
There are several directions for future research. Studying the theoretical properties of two (or more) dining clubs is clearly of interest. Adding many groups (i.e., so that the number of groups is a proportion of the number of players) might lead to unexpected phenomena. Also, allowing groups to compete for membership (by varying tax rates) might create interesting dynamics. As part of this research, investigation of the dynamics on the boundary both in theory and through numeric simulation would be of interest. A final area of future research would be to investigate the effect of taxing cheaters who are caught, thus allowing them to eat, but discouraging them from cheating. Determining the impact on the basins of attraction in this case would be the primary research objective.
Acknowledgements
A.H., A.B., and C.G. were supported in part by the National Science Foundation under grant DMS-1814876.
References
- [1] B. K. Chakrabarti, M. Mitra, A.-S. Chakrabarti, The kolkata paise hotel problem, arXiv preprint arXiv:0711.1639 (2007).
- [2] S. Biswas, A. Ghosh, A. Chatterjee, T. Naskar, B. K. Chakrabarti, Continuous transition of social efficiencies in the stochastic-strategy minority game, Physical Review E 85 (3) (2012) 031104.
- [3] B. K. Chakrabarti, A. Chatterjee, A. Ghosh, S. Mukherjee, B. Tamir, et al., Econophysics of the Kolkata Restaurant problem and related games, Springer, 2017.
- [4] A. S. Chakrabarti, D. Ghosh, Emergence of anti-coordination through reinforcement learning in generalized minority games, Journal of Economic Interaction and Coordination 14 (2019) 225–245.
- [5] D. Ghosh, A. S. Chakrabarti, Emergence of distributed coordination in the kolkata paise restaurant problem with finite information, Physica A: Statistical Mechanics and its Applications 483 (2017) 16–24.
- [6] D. Dhar, V. Sasidevan, B. K. Chakrabarti, Emergent cooperation amongst competing agents in minority games, Physica A: Statistical Mechanics and its Applications 390 (20) (2011) 3477–3485.
- [7] P. Banerjee, M. Mitra, C. Mukherjee, Kolkata paise restaurant problem and the cyclically fair norm, Econophysics of Systemic Risk and Network Dynamics (2013) 201–216.
- [8] S. Biswas, A. K. Mandal, Parallel minority game and it’s application in movement optimization during an epidemic, Physica A: Statistical Mechanics and its Applications 561 (2021) 125271.
