Black Box Meta-Learning Intrinsic Rewards
Octavio Pappalardo, Rodrigo Ramele, Juan Miguel Santos

TL;DR
This paper proposes a novel meta-learning approach to automatically learn intrinsic rewards for reinforcement learning agents, improving data efficiency and exploration in sparse-reward environments without relying on meta-gradient computations.
Contribution
The paper introduces a black-box meta-learning method for learning intrinsic rewards. It treats policy updates as black boxes, so no meta-gradients are computed through the inner optimization process.
Findings
Agents trained with meta-learned intrinsic rewards outperform agents trained with extrinsic rewards alone in sparse-reward environments.
The approach is effective across continuous control tasks with parametric and non-parametric variations.
The learned intrinsic rewards remain effective when only sparse extrinsic rewards are available during evaluation.
Abstract
The broader application of reinforcement learning (RL) is limited by challenges including data efficiency, generalization capability, and ability to learn in sparse-reward environments. Meta-learning has emerged as a promising approach to address these issues by optimizing components of the learning algorithm to meet desired characteristics. Additionally, a different line of work has extensively studied the use of intrinsic rewards to enhance the exploration capabilities of algorithms. This work investigates how meta-learning can improve the training signal received by RL agents. We introduce a method to learn intrinsic rewards within a reinforcement learning framework that bypasses the typical computation of meta-gradients through an optimization process by treating policy updates as black boxes. We validate our approach against training with extrinsic rewards, demonstrating its…
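The idea of treating the policy update as a black box can be illustrated with a small toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the chain environment, the tabular Q-learning inner learner, the table-valued intrinsic reward `theta`, and in particular the outer loop, which uses plain random-search hill climbing as a stand-in for whatever gradient-free outer optimizer the authors use. The only point it demonstrates is that the outer loop scores an intrinsic reward by the sparse extrinsic return obtained after training, without differentiating through the inner update.

```python
import numpy as np

# Hypothetical toy setup: a short chain with a sparse extrinsic reward at the far end.
N_STATES, N_ACTIONS, HORIZON = 6, 2, 12

def step(state, action):
    """Move left (action 0) or right (action 1); extrinsic reward only at the last state."""
    nxt = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, float(done), done

def inner_train(theta, episodes=30, seed=0):
    """Inner learner, treated as a black box: Q-learning on intrinsic rewards only."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(episodes):
        s = 0
        for _ in range(HORIZON):
            explore = rng.random() < 0.2
            a = int(rng.integers(N_ACTIONS)) if explore else int(np.argmax(q[s]))
            s2, _, done = step(s, a)          # the extrinsic reward is ignored here
            target = theta[s, a] + (0.0 if done else 0.9 * q[s2].max())
            q[s, a] += 0.5 * (target - q[s, a])
            s = s2
            if done:
                break
    return q

def extrinsic_return(q):
    """Greedy rollout scored with the sparse extrinsic reward."""
    s, total = 0, 0.0
    for _ in range(HORIZON):
        s, r, done = step(s, int(np.argmax(q[s])))
        total += r
        if done:
            break
    return total

def meta_objective(theta):
    """Outer objective: extrinsic return after training on the intrinsic reward theta."""
    return extrinsic_return(inner_train(theta))

def meta_learn(iterations=100, sigma=0.5, seed=1):
    """Gradient-free outer loop (random-search hill climbing, an assumed stand-in)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros((N_STATES, N_ACTIONS))
    best = meta_objective(theta)
    for _ in range(iterations):
        candidate = theta + sigma * rng.standard_normal(theta.shape)
        score = meta_objective(candidate)
        if score >= best:                     # keep candidates that are no worse
            theta, best = candidate, score
    return theta, best
```

Calling `meta_learn()` returns an intrinsic-reward table and the extrinsic return it achieved. Because the outer loop only calls `meta_objective`, the inner learner could be swapped for any other update rule without changing the outer code, which is the property the black-box framing relies on.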
Taxonomy
Topics: Machine Learning in Healthcare
Methods: Focus
