Approximated Multi-Agent Fitted Q Iteration
Antoine Lesage-Landry, Duncan S. Callaway

TL;DR
This paper introduces AMAFQI, an efficient multi-agent batch reinforcement learning method that significantly reduces computation time compared to traditional FQI while maintaining similar performance levels.
Contribution
The paper presents a novel approximation method for multi-agent FQI whose number of computations per iteration and policy evaluation scales linearly with the number of agents, improving tractability in multi-agent reinforcement learning.
Findings
AMAFQI reduces computation time compared to FQI.
AMAFQI achieves similar performance to FQI.
The number of computations per iteration and policy evaluation scales linearly with the number of agents, whereas it grows exponentially for FQI.
Abstract
We formulate an efficient approximation for multi-agent batch reinforcement learning, the approximated multi-agent fitted Q iteration (AMAFQI). We present a detailed derivation of our approach. We propose an iterative policy search and show that it yields a greedy policy with respect to multiple approximations of the centralized, learned Q-function. In each iteration and policy evaluation, AMAFQI requires a number of computations that scales linearly with the number of agents, whereas the analogous number of computations increases exponentially for the fitted Q iteration (FQI), a commonly used approach in batch reinforcement learning. This property of AMAFQI is fundamental for the design of a tractable multi-agent approach. We evaluate the performance of AMAFQI and compare it to FQI in numerical simulations. The simulations illustrate the significant computation time reduction when…
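The scaling contrast in the abstract can be illustrated with a simple counting sketch. It is not the AMAFQI algorithm itself; it only assumes that every agent has the same number of discrete actions and compares the number of joint actions a centralized maximization would have to enumerate with the number of actions considered when each agent is handled separately. The function names are hypothetical and chosen for illustration.

```python
from itertools import product


def joint_action_count(num_agents, actions_per_agent):
    # A centralized greedy step over the joint action space must consider
    # every combination of individual actions: |A| ** m candidates.
    return actions_per_agent ** num_agents


def per_agent_action_count(num_agents, actions_per_agent):
    # Handling each agent separately considers only m * |A| candidates.
    return num_agents * actions_per_agent


def enumerate_joint_actions(num_agents, actions_per_agent):
    # Explicit enumeration of the joint action space (small cases only).
    return list(product(range(actions_per_agent), repeat=num_agents))


if __name__ == "__main__":
    for m in (1, 2, 5, 10):
        print(m, joint_action_count(m, 3), per_agent_action_count(m, 3))
```

With 3 actions per agent and 10 agents, the joint space already holds 59049 combinations, against 30 per-agent candidates, which is why linear scaling matters for tractability.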
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications
