DSDF: An approach to handle stochastic agents in collaborative multi-agent reinforcement learning
Satheesh K. Perepu, Kaushik Dey

TL;DR
This paper introduces DSDF, a method that adjusts for stochasticity in multi-agent reinforcement learning, improving coordination and convergence when agents have varying levels of reliability or malfunction.
Contribution
The paper proposes DSDF, a novel approach that tunes the discount factor based on agent uncertainty, enhancing coordination among stochastic and deterministic agents.
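To illustrate why tuning the discount factor per agent matters, here is a minimal sketch. It is not the paper's method: the linear rule in `gamma_from_stochasticity`, the parameter `p_random` (the probability that an agent acts randomly), and the bounds `gamma_max` and `gamma_min` are all hypothetical, chosen only to show the effect of a lower discount factor on an agent's return.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for one episode."""
    total = 0.0
    # Accumulate backwards so each step applies one factor of gamma.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def gamma_from_stochasticity(p_random, gamma_max=0.99, gamma_min=0.5):
    """Hypothetical linear rule (an assumption, not from the paper):
    the more often an agent acts randomly, the smaller its discount factor."""
    if not 0.0 <= p_random <= 1.0:
        raise ValueError("p_random must be in [0, 1]")
    return gamma_max - p_random * (gamma_max - gamma_min)


if __name__ == "__main__":
    rewards = [1.0] * 10  # toy reward sequence, same for both agents
    for name, p_random in [("deterministic", 0.0), ("stochastic", 0.6)]:
        gamma = gamma_from_stochasticity(p_random)
        value = discounted_return(rewards, gamma)
        print(f"{name}: gamma={gamma:.3f}, return={value:.3f}")
```

Under this assumed rule, the agent with higher `p_random` weights distant rewards less, so its value estimates lean on near-term outcomes that its unreliable actions are less likely to disturb.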
Findings
DSDF improves convergence in stochastic multi-agent settings.
The method enhances coordination reliability among agents.
Results outperform existing approaches on benchmark environments.
Abstract
Multi-agent reinforcement learning has received a lot of attention in recent years and has applications in many different areas. Existing methods based on Centralized Training and Decentralized Execution attempt to train the agents to learn a pattern of coordinated actions and arrive at an optimal joint policy. However, if some agents are stochastic to varying degrees, these methods often fail to converge and provide poor coordination among agents. In this paper we show how this stochasticity of agents, which could be a result of malfunction or aging of robots, can add to the uncertainty in coordination and thereby contribute to unsatisfactory global coordination. In this case, the deterministic agents have to understand the behavior and limitations of the stochastic agents while arriving at an optimal joint policy. Our solution, DSDF which tunes the discounted…
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Multi-Objective Optimization Algorithms
