Learning to Communicate Using Counterfactual Reasoning

Simon Vanneste; Astrid Vanneste; Kevin Mets; Tom De Schepper; Ali Anwar; Siegfried Mercelis; Steven Latré; Peter Hellinckx

arXiv:2006.07200 · cs.LG · April 27, 2022 · 6 cites

TL;DR

This paper presents MACC, a novel multi-agent reinforcement learning method that uses counterfactual reasoning to improve communication protocols by addressing credit assignment, non-stationarity, and influenceability challenges.

Contribution

MACC introduces a new approach combining counterfactual reasoning, a specialized communication Q-function, and a social loss to enhance multi-agent communication learning.

Findings

01. MACC outperforms state-of-the-art baselines in four Particle environment scenarios.

02. The method effectively addresses credit assignment and non-stationarity issues.

03. Influenceable agents are successfully learned using the social loss function.

Abstract

Learning to communicate in order to share state information is an active problem in the area of multi-agent reinforcement learning (MARL). The credit assignment problem, the non-stationarity of the communication environment and the creation of influenceable agents are major challenges within this research field which need to be overcome in order to learn a valid communication protocol. This paper introduces the novel multi-agent counterfactual communication learning (MACC) method which adapts counterfactual reasoning in order to overcome the credit assignment problem for communicating agents. Secondly, the non-stationarity of the communication environment while learning the communication Q-function is overcome by creating the communication Q-function using the action policy of the other agents and the Q-function of the action environment. Additionally, a social loss function is…
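
The two mechanisms named in the abstract can be illustrated with a small numerical sketch. It is not the paper's implementation: it assumes discrete messages and actions, a single receiving agent, and a COMA-style counterfactual baseline (marginalising over the sender's own message policy). The function names `communication_q` and `counterfactual_advantage` and all numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def communication_q(action_policy_given_msg, action_q):
    """Value of each message, built from the receiver's action policy and the
    Q-function of the action environment (assumed form, not from the paper).

    action_policy_given_msg: shape (n_messages, n_actions), each row sums to 1
    action_q:                shape (n_actions,)
    """
    # Expected action-environment Q-value under the receiver's policy,
    # computed separately for every possible message.
    return action_policy_given_msg @ action_q

def counterfactual_advantage(comm_q, message_policy):
    """COMA-style advantage (an assumption about the exact form): the value of
    each message minus the expected value under the sender's message policy."""
    baseline = message_policy @ comm_q
    return comm_q - baseline

# Toy example: 2 messages, 2 actions (illustrative numbers only).
pi_recv = np.array([[0.9, 0.1],   # receiver's action distribution after message 0
                    [0.2, 0.8]])  # receiver's action distribution after message 1
q_act = np.array([1.0, 0.0])      # action-environment Q-values
pi_msg = np.array([0.5, 0.5])     # sender's current message policy

q_comm = communication_q(pi_recv, q_act)
adv = counterfactual_advantage(q_comm, pi_msg)
print(q_comm, adv)
```

In this toy setting, message 0 steers the receiver toward the higher-valued action, so it receives a positive advantage and message 1 a negative one. Because the communication value is derived from the receiver's policy and the action Q-function rather than learned from message rewards directly, it tracks changes in the receiver's behaviour, which is the intuition behind the non-stationarity claim.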

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)