Robust optimal policies for team Markov games
Feng Huang, Ming Cao, and Long Wang

TL;DR
This paper introduces a robust model of team Markov games that makes optimal policies less sensitive to uncertain model parameters. It provides a learning algorithm that converges faster than robust dynamic programming, along with practical approximation methods.
Contribution
The paper extends team Markov games to settings with uncertain model parameters and proposes a robust iterative learning algorithm with a convergence proof and improved efficiency.
Findings
The algorithm converges faster than robust dynamic programming.
The robust model extends team Markov games to incomplete-information scenarios.
Numerical simulations demonstrate improved robustness in uncertain social dilemmas.
Abstract
In stochastic dynamic environments, team Markov games have emerged as a versatile paradigm for studying sequential decision-making problems of fully cooperative multi-agent systems. However, the optimality of the derived policies is usually sensitive to model parameters, which are typically unknown and must be estimated from noisy data in practice. To mitigate the sensitivity of optimal policies to these uncertain parameters, we propose a robust model of team Markov games in this paper, where agents use robust optimization approaches to update strategies. This model extends team Markov games to the scenario of incomplete information and provides an alternative solution concept of robust team optimality. To seek such a solution, we develop a robust iterative learning algorithm of team policies and prove its convergence. This algorithm, compared with robust dynamic…
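To make the idea of robust team optimality concrete, here is a minimal sketch of a generic robust dynamic programming update for a team game with a shared reward. It is not the paper's robust iterative learning algorithm; it only illustrates the max-min principle that robust optimization applies to joint actions. Everything in it is an assumption made for illustration: the function name `robust_team_value_iteration`, a finite set of candidate models standing in for the uncertainty set, a worst case taken independently for each state and joint action, and a discounted objective.

```python
import numpy as np


def robust_team_value_iteration(P_models, R_models, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Max-min value iteration over joint actions (illustrative sketch).

    P_models: shape (M, S, A, S), candidate transition kernels, where A
              indexes joint actions of the whole team (hypothetical layout).
    R_models: shape (M, S, A), candidate shared team rewards.
    Returns the robust value function and a greedy joint-action policy.
    """
    P = np.asarray(P_models, dtype=float)
    R = np.asarray(R_models, dtype=float)
    V = np.zeros(P.shape[1])

    for _ in range(max_iter):
        # Q[m, s, a]: value of joint action a in state s under candidate model m.
        Q = R + gamma * np.einsum("msat,t->msa", P, V)
        # Worst case over models for each (s, a), then best joint action per state.
        V_new = Q.min(axis=0).max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    Q = R + gamma * np.einsum("msat,t->msa", P, V)
    policy = Q.min(axis=0).argmax(axis=1)
    return V, policy


# Usage: two agents with two actions each give four joint actions.
rng = np.random.default_rng(0)
n_models, n_states, n_joint = 3, 2, 4
P_example = rng.dirichlet(np.ones(n_states), size=(n_models, n_states, n_joint))
R_example = rng.uniform(0.0, 1.0, size=(n_models, n_states, n_joint))
V_example, policy_example = robust_team_value_iteration(P_example, R_example)
# Decode each joint-action index into one action per agent.
agent_actions = np.unravel_index(policy_example, (2, 2))
```

Taking the minimum separately for each state and joint action is a simplifying choice that keeps the update a contraction for `gamma < 1`; other uncertainty structures would need a different inner optimization.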
Taxonomy
Topics: Game Theory and Applications · Reinforcement Learning in Robotics · Experimental Behavioral Economics Studies
