Residual Q-Networks for Value Function Factorizing in Multi-Agent Reinforcement Learning

Rafael Pina, Varuna De Silva, Joosep Hook, and Ahmet Kondoz

arXiv:2205.15245·cs.LG·May 31, 2022

Open Access

TL;DR

This paper introduces Residual Q-Networks (RQNs) for multi-agent reinforcement learning, enhancing stability and convergence speed in cooperative tasks by transforming individual Q-values while maintaining the IGM criterion.

Contribution

The paper proposes Residual Q-Networks as a novel auxiliary network to improve factorization stability and convergence in multi-agent reinforcement learning.

Findings

01. RQNs outperform state-of-the-art methods in convergence speed.

02. RQNs demonstrate increased stability across diverse environments.

03. Performance gains are notable in environments with severe punishments and partial observability.

Abstract

Multi-Agent Reinforcement Learning (MARL) is useful in many problems that require the cooperation and coordination of multiple agents. Learning optimal policies with reinforcement learning in a multi-agent setting becomes increasingly difficult as the number of agents grows. Recent solutions such as Value Decomposition Networks (VDN), QMIX, QTRAN and QPLEX adhere to the centralized training and decentralized execution scheme and factorize the joint action-value function. However, these methods still struggle as environmental complexity increases, and at times fail to converge in a stable manner. We propose a novel concept of Residual Q-Networks (RQNs) for MARL, which learn to transform the individual Q-value trajectories in a way that preserves the Individual-Global-Max (IGM) criterion but is more robust in factorizing action-value functions. The RQN acts as an auxiliary…
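The IGM criterion mentioned in the abstract requires that the joint action maximizing the joint action-value function coincides with the tuple of each agent's own greedy action. The sketch below is only an illustration of that criterion, not the RQN method itself: it assumes a tabular setting, an additive VDN-style factorization, and made-up Q-values, and every function and variable name in it is hypothetical.

```python
from itertools import product


def greedy_joint_action(joint_q):
    """Return the joint action (a tuple) with the highest joint Q-value."""
    return max(joint_q, key=joint_q.get)


def satisfies_igm(individual_qs, joint_q):
    """Check that the per-agent greedy actions form the joint greedy action.

    Ties are broken by first occurrence, so this check is only meaningful
    when the maxima are unique (as they are in the example below).
    """
    decentralized = tuple(
        max(range(len(q)), key=q.__getitem__) for q in individual_qs
    )
    return decentralized == greedy_joint_action(joint_q)


def vdn_joint_q(individual_qs):
    """Additive (VDN-style) factorization: Q_tot(u) = sum_i Q_i(u_i)."""
    action_ranges = [range(len(q)) for q in individual_qs]
    return {
        u: sum(q[a] for q, a in zip(individual_qs, u))
        for u in product(*action_ranges)
    }


# Made-up Q-values for two agents with three actions each (assumption).
individual_qs = [[0.2, 1.0, -0.5], [0.7, 0.1, 0.3]]

# With an additive factorization the criterion holds by construction.
joint_q = vdn_joint_q(individual_qs)
igm_holds = satisfies_igm(individual_qs, joint_q)

# A joint table that heavily penalizes one joint action is no longer
# consistent with these individual Q-values, so the check fails.
penalized = dict(joint_q)
penalized[(1, 0)] = -10.0
igm_broken = not satisfies_igm(individual_qs, penalized)

print(igm_holds, igm_broken)
```

The second case hints at why factorization can be hard: once the true joint values stop being a simple sum of per-agent terms, the individual Q-values must change for decentralized greedy execution to remain consistent with the joint optimum.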

Taxonomy

Topics: Reinforcement Learning in Robotics · Neurobiology and Insect Physiology Research