Approachability in Stackelberg Stochastic Games with Vector Costs
Dileep Kalathil, Vivek Borkar, Rahul Jain

TL;DR
This paper extends Blackwell's approachability concept to Stackelberg stochastic games with vector costs, providing a practical strategy and a reinforcement learning algorithm for unknown environments, with theoretical guarantees.
Contribution
It introduces a simple, computationally feasible approachability strategy and a reinforcement learning method for Stackelberg stochastic games with vector costs, including theoretical characterizations.
Findings
Proposes a tractable approachability strategy for Stackelberg stochastic games.
Develops a reinforcement learning algorithm for unknown transition kernels.
Provides a complete characterization of approachability conditions for convex sets.
Abstract
The notion of approachability was introduced by Blackwell [1] in the context of vector-valued repeated games. Blackwell's celebrated approachability theorem prescribes a strategy for approachability, i.e., for 'steering' the average cost of a given agent towards a given target set, irrespective of the strategies of the other agents. In this paper, motivated by multi-objective optimization and decision-making problems in dynamically changing environments, we address the approachability problem in Stackelberg stochastic games with vector-valued cost functions. We make two main contributions. Firstly, we give a simple and computationally tractable strategy for approachability in Stackelberg stochastic games along the lines of Blackwell's. Secondly, we give a reinforcement learning algorithm for learning the approachable strategy when the transition kernel is unknown. We also recover as a…
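To make the idea of 'steering' an average vector cost concrete, here is a minimal Python sketch of Blackwell's classical strategy in a plain repeated game. It is not the paper's Stackelberg stochastic-game strategy or its learning algorithm. The setup is an assumption chosen for illustration: the agent mixes over n actions, the opponent picks a loss in [0, 1] for each action, the vector cost is the agent's regret against each fixed action, and the target set is the non-positive orthant. The function name `blackwell_regret_matching` and all parameters are hypothetical.

```python
import numpy as np

def blackwell_regret_matching(costs):
    """Steer the average regret vector towards the non-positive orthant.

    costs: array of shape (T, n); costs[t, i] is the loss in [0, 1] that the
    opponent assigns to action i at round t (illustrative assumption).
    Returns the average vector cost and its Euclidean distance to the target set.
    """
    T, n = costs.shape
    cum = np.zeros(n)  # cumulative vector cost (regret against each action)
    for t in range(T):
        # Direction from the projection onto the orthant to the current point.
        pos = np.maximum(cum, 0.0)
        total = pos.sum()
        # Blackwell's rule for this target set: mix proportionally to the
        # positive part, so the next vector cost is orthogonal to `pos`
        # whatever the opponent plays. Any mix works once inside the set.
        p = pos / total if total > 0 else np.full(n, 1.0 / n)
        c = costs[t]
        # Expected loss of the mixed action minus the loss of each pure action.
        cum += p @ c - c
    avg = cum / T
    dist = np.linalg.norm(np.maximum(avg, 0.0))
    return avg, dist

rng = np.random.default_rng(0)
costs = rng.random((10_000, 3))  # stand-in for an arbitrary opponent
avg, dist = blackwell_regret_matching(costs)
print(dist <= np.sqrt(3 / 10_000))
```

Because the vector cost is computed in expectation over the agent's mixed action, the standard argument bounds the distance to the target set by sqrt(n / T) for every loss sequence, which is what the final line checks. A sampled-action version would satisfy a similar bound only in expectation or with high probability.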
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Economic theories and models
