Non-local Policy Optimization via Diversity-regularized Collaborative Exploration
Zhenghao Peng, Hao Sun, Bolei Zhou

TL;DR
This paper introduces DiCE, a policy optimization framework in which a team of heterogeneous agents explores the environment simultaneously, shares the collected experience, and is kept diverse by a regularizer, leading to improved performance in continuous control tasks.
Contribution
The paper proposes a collaborative exploration method in which several agents explore in parallel and share their experience, together with a regularization mechanism that maintains the diversity of the team and modulates exploration.
Findings
DiCE outperforms baseline algorithms in MuJoCo tasks.
Diversity regularization improves exploration efficiency.
Multi-agent collaboration enhances policy optimization.
Abstract
Conventional Reinforcement Learning (RL) algorithms usually have one single agent learning to solve the task independently. As a result, the agent can only explore a limited part of the state-action space while the learned behavior is highly correlated to the agent's previous experience, making the training prone to a local minimum. In this work, we empower RL with the capability of teamwork and propose a novel non-local policy optimization framework called Diversity-regularized Collaborative Exploration (DiCE). DiCE utilizes a group of heterogeneous agents to explore the environment simultaneously and share the collected experiences. A regularization mechanism is further designed to maintain the diversity of the team and modulate the exploration. We implement the framework in both on-policy and off-policy settings and the experimental results show that DiCE can achieve substantial…
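The two mechanisms named in the abstract, experience sharing and diversity regularization, can be illustrated with a minimal sketch. It is not the authors' implementation: the diversity measure (mean squared distance between agents' actions on a shared batch of states), the additive bonus, and the names `team_diversity`, `regularized_objective`, `share_experience`, and `weight` are all assumptions made here for illustration.

```python
import numpy as np


def team_diversity(actions):
    """Per-agent diversity score (hypothetical measure, not from the paper).

    actions: array of shape (n_agents, n_states, action_dim), the action each
    agent would take on the same batch of states.
    Returns an array of shape (n_agents,): the mean squared action distance
    from each agent to the other agents in the team.
    """
    actions = np.asarray(actions, dtype=float)
    n_agents = actions.shape[0]
    # Pairwise differences: (n_agents, n_agents, n_states, action_dim).
    diffs = actions[:, None] - actions[None, :]
    # Squared distance summed over action dims, averaged over states.
    pair_dist = (diffs ** 2).sum(axis=-1).mean(axis=-1)
    # The diagonal is zero, so dividing by (n_agents - 1) averages over peers.
    return pair_dist.sum(axis=1) / (n_agents - 1)


def regularized_objective(task_returns, actions, weight=0.1):
    """Task return plus a weighted diversity bonus (assumed additive form)."""
    task_returns = np.asarray(task_returns, dtype=float)
    return task_returns + weight * team_diversity(actions)


def share_experience(per_agent_batches):
    """Pool the transitions collected by every agent into one shared batch."""
    return [t for batch in per_agent_batches for t in batch]


# Two agents, one state, one action dimension.
actions = [[[0.0]], [[2.0]]]
scores = regularized_objective([1.0, 2.0], actions, weight=0.5)
pooled = share_experience([["s0", "s1"], ["s2"]])
```

In this sketch, agents that act identically receive a zero bonus, so the regularizer only rewards behavior that differs from teammates; how strongly it does so is set by the assumed `weight` parameter.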
Taxonomy
Topics: Reinforcement Learning in Robotics · Robotic Locomotion and Control · Innovative Human-Technology Interaction
