Game-theoretical control with continuous action sets
Steven Perkins, Panayotis Mertikopoulos, David S. Leslie

TL;DR
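This paper introduces an actor-critic reinforcement learning algorithm that provably converges to equilibrium in potential games with continuous action sets. To prove this, it extends finite-dimensional two-timescale stochastic approximation theory to an infinite-dimensional Banach space setting, with applications to distributed control.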
Contribution
It develops an infinite-dimensional mean-field analysis and uses it to prove that a new actor-critic learning algorithm converges to equilibrium in continuous-action potential games.
Findings
The algorithm converges to equilibrium in potential games with continuous controls.
The analysis extends stochastic approximation theory to infinite-dimensional Banach spaces.
Players do not need to track other agents' controls during learning.
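For readers unfamiliar with the term, the standard (exact) potential game condition can be written as below. This is the textbook definition, not a formula quoted from the paper; the notation (players i, action sets A_i, utilities u_i, potential \phi) is assumed here for illustration, and the paper's precise assumptions may differ.

```latex
% Assumed notation: players i = 1, ..., N with action sets A_i,
% utilities u_i, and a single potential function \phi shared by all players.
\[
  u_i(a_i, a_{-i}) - u_i(a_i', a_{-i})
  = \phi(a_i, a_{-i}) - \phi(a_i', a_{-i})
  \quad \text{for all } i,\; a_i, a_i' \in A_i,\; a_{-i} \in A_{-i}.
\]
```

In words, any unilateral change in a player's utility is matched exactly by the change in the common potential, so maximisers of \phi are equilibria of the game.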
Abstract
Motivated by the recent applications of game-theoretical learning techniques to the design of distributed control systems, we study a class of control problems that can be formulated as potential games with continuous action sets, and we propose an actor-critic reinforcement learning algorithm that provably converges to equilibrium in this class of problems. The method employed is to analyse the learning process under study through a mean-field dynamical system that evolves in an infinite-dimensional function space (the space of probability distributions over the players' continuous controls). To do so, we extend the theory of finite-dimensional two-timescale stochastic approximation to an infinite-dimensional, Banach space setting, and we prove that the continuous dynamics of the process converge to equilibrium in the case of potential games. These results combine to give a…
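The two-timescale actor-critic idea described in the abstract can be illustrated with a rough sketch. This is not the paper's algorithm: it discretises each player's control to a finite grid (the paper works with distributions over continuous controls), and the payoff function, step sizes, temperature, and all names (`potential`, `softmax`, `run`) are assumptions made for this example. Each player keeps a critic (payoff estimates per action, updated on a fast timescale) and an actor (a mixed strategy, updated on a slow timescale), and uses only its own action and the observed reward.

```python
import numpy as np

def potential(a1, a2):
    # Hypothetical common payoff (an identical-interest game is a potential
    # game); on the unit square it is maximised at (a1, a2) = (0.4, 0.6).
    return -(a1 - 0.3) ** 2 - (a2 - 0.7) ** 2 - 0.5 * (a1 - a2) ** 2

def softmax(q, temperature):
    # Smoothed best response to the current payoff estimates.
    w = np.exp((q - q.max()) / temperature)
    return w / w.sum()

def run(steps=20000, grid_size=11, temperature=0.05, noise=0.05, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, grid_size)        # discretised controls
    q = np.zeros((2, grid_size))                   # critics: payoff estimates
    counts = np.zeros((2, grid_size))              # visits per (player, action)
    pi = np.full((2, grid_size), 1.0 / grid_size)  # actors: mixed strategies
    for n in range(1, steps + 1):
        # Each player samples its own control from its own strategy.
        idx = [rng.choice(grid_size, p=pi[i]) for i in range(2)]
        reward = potential(grid[idx[0]], grid[idx[1]]) + noise * rng.standard_normal()
        for i in range(2):
            k = idx[i]
            counts[i, k] += 1
            alpha = counts[i, k] ** -0.6           # fast (critic) step size
            q[i, k] += alpha * (reward - q[i, k])
            beta = 1.0 / (n + 1)                   # slow (actor) step size
            pi[i] += beta * (softmax(q[i], temperature) - pi[i])
    return grid, pi

if __name__ == "__main__":
    grid, pi = run()
    print("mean controls:", pi @ grid)
```

The critic step size decays more slowly than the actor step size, so the payoff estimates track the slowly changing strategies; this separation is the general idea behind two-timescale stochastic approximation. Because of the softmax smoothing, the strategies settle on distributions spread around the maximiser rather than on a single point.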
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Game Theory and Applications
