The impact of intrinsic rewards on exploration in Reinforcement Learning
Aya Kayal, Eduardo Pignatelli, Laura Toni

TL;DR
This paper investigates how intrinsic rewards that impose diversity at different levels (state, policy, and skill) shape exploration in Reinforcement Learning. In MiniGrid, State Count performs best with low-dimensional observations, while Maximum Entropy is more robust with RGB observations.
Contribution
It provides an empirical comparison of four intrinsic rewards at various diversity levels, clarifying their impact on exploration in MiniGrid environments.
Findings
State Count performs best with low-dimensional observations.
Maximum Entropy offers more robust exploration with RGB observations.
Learning diverse skills with DIAYN does not necessarily enhance exploration.
Abstract
One of the open challenges in Reinforcement Learning is the hard exploration problem in sparse-reward environments. Various types of intrinsic rewards have been proposed to address this challenge by pushing towards diversity. This diversity can be imposed at different levels, encouraging the agent to explore different states, policies or behaviours (State, Policy and Skill level diversity, respectively). However, the impact of diversity on the agent's behaviour remains unclear. In this work, we aim to fill this gap by studying the effect of different levels of diversity imposed by intrinsic rewards on the exploration patterns of RL agents. We select four intrinsic rewards (State Count, Intrinsic Curiosity Module (ICM), Maximum Entropy, and Diversity Is All You Need (DIAYN)), each pushing for a different diversity level. We conduct an empirical study on the MiniGrid environment to compare…
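To make the state-level case concrete, here is a minimal sketch of a count-based intrinsic reward in the spirit of State Count. It is an illustration under stated assumptions, not the paper's implementation: the class name `StateCountBonus`, the scale `beta`, and the common `beta / sqrt(N(s))` form are all choices made here, since the excerpt does not give the exact formula used in the study.

```python
from collections import defaultdict
import math


class StateCountBonus:
    """Hypothetical count-based bonus: r_int(s) = beta / sqrt(N(s)).

    The bonus form and the `beta` scale are assumptions for illustration;
    the paper's exact State Count definition is not given in this excerpt.
    """

    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = defaultdict(int)  # visit count per discrete state

    def __call__(self, state):
        # States must be hashable, so a low-dimensional observation
        # (e.g. an agent's grid position) is converted to a tuple.
        key = tuple(state)
        self.counts[key] += 1
        return self.beta / math.sqrt(self.counts[key])


bonus = StateCountBonus(beta=1.0)
first = bonus((0, 0))   # first visit to (0, 0)
second = bonus((0, 0))  # second visit: the bonus shrinks
novel = bonus((0, 1))   # a new state gets the full bonus again
```

Because the bonus depends on exact visit counts, it needs observations that repeat exactly, which is consistent with the finding that State Count does best with low-dimensional observations.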
Taxonomy
Topics: Reinforcement Learning in Robotics · Innovation Diffusion and Forecasting
