Ensemble Reinforcement Learning in Continuous Spaces -- A Hierarchical Multi-Step Approach for Policy Training
Gang Chen, Victoria Huang

TL;DR
This paper introduces a hierarchical ensemble reinforcement learning method with multi-step training that enhances exploration, stability, and performance in continuous control tasks by promoting collaboration among base learners.
Contribution
It presents a novel multi-step integration training technique for ensemble DRL, enabling effective inter-learner collaboration and improved learning stability.
Findings
Outperforms state-of-the-art DRL algorithms on benchmark tasks
Enhances exploration and stability in continuous control environments
Theoretically verified hierarchical ensemble learning algorithm
Abstract
Actor-critic deep reinforcement learning (DRL) algorithms have recently achieved prominent success in tackling various challenging reinforcement learning (RL) problems, particularly complex control tasks with high-dimensional continuous state and action spaces. Nevertheless, existing research has shown that actor-critic DRL algorithms often fail to explore their learning environments effectively, which limits learning stability and performance. To address this limitation, several ensemble DRL algorithms have recently been proposed to boost exploration and stabilize the learning process. However, most existing ensemble algorithms do not explicitly train all base learners to jointly optimize the performance of the ensemble. In this paper, we propose a new technique to train an ensemble of base learners based on an innovative multi-step integration method. This training…
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural Networks and Reservoir Computing
Methods: Balanced Selection
