Maximum a posteriori learning in demand competition games

Mohsen Rakhshan

arXiv:1611.10270·cs.GT·December 1, 2016

Open Access

TL;DR

This paper proves that in a two-firm inventory competition game, players can learn to play the Nash equilibrium by repeatedly observing their own sales and applying MAP estimation, even without knowing the opponent's action or utility function.

Contribution

It proves that players can learn the Nash policy and converge to equilibrium using MAP estimation based solely on their own sales observations.

Findings

01

Players' actions converge to the Nash equilibrium.

02

Beliefs about opponents' strategies also converge.

03

MAP estimation enables learning from a player's own sales alone, without observing the opponent's action or utility function.

Abstract

We consider an inventory competition game between two firms. The question we address is this: if players do not know the opponent's action or the opponent's utility function, can they learn to play the Nash policy in a repeated game by observing only their own sales? We prove that, by means of Maximum A Posteriori (MAP) estimation, players can learn the Nash policy, and that players' actions and beliefs converge to the Nash equilibrium.
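To make the idea of MAP estimation from one's own sales concrete, here is a minimal sketch. It is a generic illustration and not the paper's model: it assumes a single firm facing Poisson demand with an unknown rate, sales censored at the stocked quantity, a grid of candidate rates, and a flat prior by default. All names (`poisson_log_pmf`, `log_likelihood`, `map_estimate`, `grid`) are hypothetical.

```python
import math

def poisson_log_pmf(k, lam):
    """Log of the Poisson probability P(D = k) for rate lam > 0."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)

def log_likelihood(lam, sales, stock):
    """Log-likelihood of censored sales s_t = min(D_t, y_t), assuming Poisson(lam) demand."""
    total = 0.0
    for s, y in zip(sales, stock):
        if s < y:
            # The shelf did not empty, so sales reveal demand exactly.
            total += poisson_log_pmf(s, lam)
        else:
            # Stockout: the firm only learns that demand was at least y.
            below = sum(math.exp(poisson_log_pmf(k, lam)) for k in range(y))
            total += math.log(max(1.0 - below, 1e-300))
    return total

def map_estimate(sales, stock, grid, log_prior=lambda lam: 0.0):
    """Return the grid point maximizing log prior + log-likelihood (flat prior by default)."""
    return max(grid, key=lambda lam: log_prior(lam) + log_likelihood(lam, sales, stock))

# Assumed example data: five periods, 10 units stocked each period, no stockouts.
grid = [0.5 * i for i in range(1, 41)]  # candidate rates 0.5, 1.0, ..., 20.0
estimate = map_estimate(sales=[3, 5, 4, 6, 2], stock=[10] * 5, grid=grid)
```

In this sketch, periods that end in a stockout contribute only a tail probability, so they pull the estimate upward rather than pinning it down. In a competitive setting the demand a firm faces would also depend on the opponent's stocking decision, which this single-firm illustration does not model.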

Taxonomy

Topics: Game Theory and Applications · Experimental Behavioral Economics Studies · Economic theories and models