Maximum Entropy Model-based Reinforcement Learning
Oleg Svidchenko, Aleksei Shpilman

TL;DR
This paper introduces a novel exploration method tailored for model-based reinforcement learning, significantly enhancing the sample efficiency and performance of the Dreamer algorithm in complex control tasks.
Contribution
The paper presents a new exploration technique designed specifically for model-based RL, a class of algorithms for which, to the authors' knowledge, no exploration-based approach to improving sample efficiency previously existed.
Findings
The proposed method improves Dreamer's performance in high-dimensional tasks.
Experimental results show increased sample efficiency and better exploration.
The approach outperforms existing exploration strategies in model-based RL.
Abstract
Recent advances in reinforcement learning have demonstrated its ability to solve hard agent-environment interaction tasks at a super-human level. However, the application of reinforcement learning methods to practical, real-world tasks is currently limited by the sample inefficiency of most state-of-the-art RL algorithms, i.e., the need for a vast number of training episodes. For example, the OpenAI Five algorithm, which has beaten human players in Dota 2, trained for thousands of years of game time. Several approaches tackle the issue of sample inefficiency: they either offer more efficient usage of already gathered experience or aim to gain more relevant and diverse experience via better exploration of the environment. However, to our knowledge, no such approach exists for model-based algorithms, which have shown high sample efficiency in solving hard control tasks with…
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Bandit Algorithms Research
