Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation

Mohammad Salimibeni, Arash Mohammadi, Parvin Malekzadeh, and Konstantinos N. Plataniotis

arXiv:2112.15156·cs.LG·January 3, 2022

Open Access

TL;DR

This paper introduces MAK-TD and MAK-SR, multi-agent reinforcement learning frameworks that use adaptive Kalman filtering and successor representations to improve learning efficiency and robustness in complex environments.

Contribution

It proposes the MAK-TD and MAK-SR frameworks that incorporate Kalman filtering and adaptive estimation for better handling of uncertainty in multi-agent RL with continuous action spaces.

Findings

1. MAK-TD and MAK-SR outperform traditional methods in OpenAI Gym benchmarks.

2. The frameworks effectively model uncertainty and improve sample efficiency.

3. They demonstrate robustness to parameter variations and environment complexities.

Abstract

Distributed Multi-Agent Reinforcement Learning (MARL) algorithms have attracted a surge of interest lately, mainly due to the recent advancements of Deep Neural Networks (DNNs). Conventional Model-Based (MB) or Model-Free (MF) RL algorithms are not directly applicable to MARL problems because they rely on a fixed reward model for learning the underlying value function. While DNN-based solutions perform extremely well when a single agent is involved, such methods fail to fully generalize to the complexities of MARL problems. In other words, although recently developed DNN-based approaches for multi-agent environments have achieved superior performance, they are still prone to overfitting, high sensitivity to parameter selection, and sample inefficiency. The paper proposes the Multi-Agent Adaptive Kalman Temporal Difference (MAK-TD) framework and its Successor Representation-based…
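For readers unfamiliar with the Kalman temporal difference idea named in the abstract, the sketch below may help. It is not the paper's MAK-TD algorithm: it is a generic single-agent Kalman filter update for a linear value function, with fixed noise levels and no adaptive or multi-agent component. The function names (`kalman_td_update`, `demo`), the one-hot features, the two-state toy chain, and all numeric settings are assumptions chosen for illustration.

```python
import numpy as np


def kalman_td_update(theta, P, phi_s, phi_next, reward,
                     gamma=0.9, obs_noise=1e-2, process_noise=0.0):
    """One Kalman-filter step for a linear value function V(s) = phi(s) @ theta.

    The reward is treated as a noisy scalar observation of
    (phi(s) - gamma * phi(s')) @ theta. All defaults are illustrative.
    """
    h = phi_s - gamma * phi_next                      # observation vector
    P_pred = P + process_noise * np.eye(len(theta))   # predict step (random-walk weights)
    innovation = reward - h @ theta                   # TD error under current weights
    s = h @ P_pred @ h + obs_noise                    # innovation variance (scalar)
    K = P_pred @ h / s                                # Kalman gain
    theta_new = theta + K * innovation                # corrected weight estimate
    P_new = P_pred - np.outer(K, h) @ P_pred          # corrected weight covariance
    return theta_new, P_new


def demo(n_sweeps=500):
    """Toy chain: state 0 -> state 1 (reward 0), state 1 -> state 1 (reward 1)."""
    phi = np.eye(2)                                   # one-hot features (hypothetical choice)
    theta = np.zeros(2)
    P = 10.0 * np.eye(2)                              # broad prior over the weights
    transitions = [(0, 1, 0.0), (1, 1, 1.0)]          # (state, next_state, reward)
    for _ in range(n_sweeps):
        for s, s_next, r in transitions:
            theta, P = kalman_td_update(theta, P, phi[s], phi[s_next], r)
    return theta
```

With a discount of 0.9 the true values of this toy chain are V(1) = 1 / (1 - 0.9) = 10 and V(0) = 0.9 × 10 = 9, so the weights returned by `demo()` should settle close to those numbers. The covariance `P` is what distinguishes this from plain TD learning: it tracks uncertainty in the weights and sets the step size through the Kalman gain instead of a hand-tuned learning rate.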

Taxonomy

Topics: Reinforcement Learning in Robotics · Target Tracking and Data Fusion in Sensor Networks · Advanced Multi-Objective Optimization Algorithms