On Passivity, Reinforcement Learning and Higher-Order Learning in Multi-Agent Finite Games
Bolin Gao, Lacra Pavel

TL;DR
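This paper introduces a passivity-based approach for analyzing and designing reinforcement learning algorithms in multi-agent finite games. It establishes convergence to a Nash distribution in monotone games and proposes higher-order schemes that can converge faster, including in some cases where their first-order counterparts fail to converge.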
Contribution
It develops a passivity-based framework for reinforcement learning in multi-agent finite games, including higher-order schemes that preserve convergence properties and can improve convergence speed.
Findings
Convergence to a Nash distribution in games whose (negative) payoff is monotone
Higher-order schemes can improve convergence speed
Higher-order schemes can converge where their first-order counterparts fail
Abstract
In this paper, we propose a passivity-based methodology for the analysis and design of reinforcement learning in multi-agent finite games. Starting from a known exponentially-discounted reinforcement learning scheme, we show that it converges to a Nash distribution in the class of games characterized by the monotonicity property of their (negative) payoff. We further exploit passivity to propose a class of higher-order schemes that preserve convergence properties, can improve the speed of convergence, and can even converge in cases where their first-order counterparts fail to converge. We demonstrate these properties through numerical simulations for several representative games.
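To make the first-order scheme concrete, here is a minimal sketch of an exponentially-discounted score dynamic on a two-player game. The abstract does not state the update rule, so the form used here is an assumption: each player keeps a score vector z that tracks its payoff vector, dz/dt = gamma * (u(x) - z), and plays the softmax of its scores. The function names, the Euler step size, the temperature `eps`, and the matching-pennies example are illustrative choices, not taken from the paper.

```python
import numpy as np

def softmax(z, eps=1.0):
    # Logit choice map with temperature eps (assumed form of the choice map).
    w = np.exp((z - z.max()) / eps)
    return w / w.sum()

def simulate_exp_discounted_rl(A, B, z1_init, z2_init,
                               steps=5000, dt=0.01, gamma=1.0, eps=1.0):
    # A[i, j]: payoff to player 1, B[i, j]: payoff to player 2,
    # when player 1 plays action i and player 2 plays action j.
    z1 = np.array(z1_init, dtype=float)
    z2 = np.array(z2_init, dtype=float)
    for _ in range(steps):
        x1, x2 = softmax(z1, eps), softmax(z2, eps)
        u1 = A @ x2        # expected payoff of each action of player 1
        u2 = B.T @ x1      # expected payoff of each action of player 2
        # Forward-Euler step of dz/dt = gamma * (u - z) (assumed dynamics).
        z1 += dt * gamma * (u1 - z1)
        z2 += dt * gamma * (u2 - z2)
    return softmax(z1, eps), softmax(z2, eps)

if __name__ == "__main__":
    # Matching pennies: a zero-sum game, used here as a simple monotone example.
    A = np.array([[1.0, -1.0], [-1.0, 1.0]])
    x1, x2 = simulate_exp_discounted_rl(A, -A, [1.0, 0.0], [0.0, 0.5])
    print(x1, x2)
```

In this zero-sum example the mixed strategies should settle near the uniform mix, which is the fixed point of x = softmax(u(x)) by symmetry. The higher-order schemes mentioned in the abstract are not sketched here because their form is not given in this summary.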
