Bounded Exploration with World Model Uncertainty in Soft Actor-Critic Reinforcement Learning Algorithm
Ting Qiao, Henry Williams, David Valencia, Bruce MacDonald

TL;DR
This paper introduces bounded exploration, a novel method combining 'soft' and intrinsic motivation exploration, significantly enhancing the performance and convergence speed of the Soft Actor-Critic algorithm in reinforcement learning tasks.
Contribution
It presents a new exploration technique that improves the efficiency and performance of deep reinforcement learning (DRL), and offers a way to introduce intrinsic motivation when the original reward function has a strict meaning.
Findings
Achieved highest scores in 6 out of 8 experiments
Significantly improved convergence speed of SAC and its model-based extension
Enhanced exploration efficiency in complex environments
Abstract
One of the bottlenecks preventing Deep Reinforcement Learning (DRL) algorithms from being applied in the real world is how to explore the environment and collect informative transitions efficiently. The present paper describes bounded exploration, a novel exploration method that integrates both 'soft' and intrinsic motivation exploration. Bounded exploration notably improved the performance of the Soft Actor-Critic algorithm and the convergence speed of its model-based extension. It achieved the highest score in 6 out of 8 experiments. Bounded exploration presents an alternative way to introduce intrinsic motivation into exploration when the original reward function has a strict meaning.
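The abstract does not spell out the mechanism. One plausible reading, suggested by the title's reference to world model uncertainty, is that candidate actions are drawn from the stochastic SAC policy (the 'soft' part) and the candidate on which an ensemble of learned dynamics models disagrees most is executed (the intrinsic part), leaving the reward untouched. The sketch below illustrates that reading only. The function names, the toy linear models, the variance-based disagreement score, and the candidate count are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def sample_candidates(mean, log_std, num_candidates, rng):
    """Draw candidate actions from a tanh-squashed Gaussian policy (assumed SAC-style)."""
    std = np.exp(log_std)
    noise = rng.standard_normal((num_candidates, mean.shape[0]))
    return np.tanh(mean + std * noise)


def ensemble_disagreement(models, state, actions):
    """Variance of next-state predictions across ensemble members, summed over state dims."""
    preds = np.stack([m(state, actions) for m in models])  # (ensemble, candidates, state_dim)
    return preds.var(axis=0).sum(axis=-1)                  # (candidates,)


def bounded_explore(mean, log_std, models, state, num_candidates=8, rng=None):
    """Pick the policy-sampled candidate with the highest model disagreement.

    Exploration stays bounded by the policy because every candidate is a
    policy sample; the uncertainty score only chooses among them.
    """
    rng = np.random.default_rng() if rng is None else rng
    candidates = sample_candidates(mean, log_std, num_candidates, rng)
    scores = ensemble_disagreement(models, state, candidates)
    return candidates[int(np.argmax(scores))], candidates, scores


def make_toy_model(weight):
    """Hypothetical linear dynamics model: next_state = state + weight * action."""
    def model(state, actions):
        return state[None, :] + weight * actions
    return model


rng = np.random.default_rng(0)
models = [make_toy_model(w) for w in (0.5, 1.0, 1.5)]
action, candidates, scores = bounded_explore(
    mean=np.zeros(2), log_std=np.full(2, -0.5), models=models,
    state=np.zeros(2), num_candidates=8, rng=rng,
)
```

Because the intrinsic signal here only ranks actions the policy already proposes, it never enters the reward, which is in line with the abstract's point about reward functions that have a strict meaning.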
Taxonomy
Topics: Reinforcement Learning in Robotics
