CAMEO: Curiosity Augmented Metropolis for Exploratory Optimal Policies
Simo Alami.C, Fernando Llorente, Rim Kaddah, Luca Martino, and Jesse Read

TL;DR
This paper introduces CAMEO, a curiosity-augmented Metropolis algorithm that samples diverse optimal policies in reinforcement learning, enabling better coverage of behaviors and risk profiles in control tasks.
Contribution
The paper proposes a novel method, CAMEO, to sample and analyze the distribution of optimal policies, enhancing understanding of policy diversity in reinforcement learning.
Findings
CAMEO successfully samples diverse optimal policies in classic control problems.
The method handles environments with sparse rewards effectively.
Sampled policies exhibit different risk profiles, useful for interpretability.
Abstract
Reinforcement Learning has drawn huge interest as a tool for solving optimal control problems. Solving a given problem (task or environment) involves converging towards an optimal policy. However, there might exist multiple optimal policies that can dramatically differ in their behaviour; for example, some may be faster than the others but at the expense of greater risk. We consider and study a distribution of optimal policies. We design a curiosity-augmented Metropolis algorithm (CAMEO), such that we can sample optimal policies, and such that these policies effectively adopt diverse behaviours, since this implies greater coverage of the different possible optimal policies. In experimental simulations we show that CAMEO indeed obtains policies that all solve classic control problems, and even in the challenging case of environments that provide sparse rewards. We further show that the…
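The abstract is cut off before it describes the target distribution or the form of the curiosity term, so the block below is only a generic sketch of the idea: a random-walk Metropolis sampler whose score adds an exploration bonus to the return. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the one-dimensional policy parameter `theta`, the two-optimum `toy_return`, the distance-based `novelty` bonus, and the `beta`, `temperature`, and `step` settings.

```python
import math
import random


def toy_return(theta):
    # Hypothetical stand-in for an episode return: two equally good
    # optima, one at theta = -1 and one at theta = +1.
    left = math.exp(-8.0 * (theta + 1.0) ** 2)
    right = math.exp(-8.0 * (theta - 1.0) ** 2)
    return max(left, right)


def novelty(theta, archive):
    # Hypothetical curiosity bonus: distance to the nearest previously
    # accepted sample, capped at 1. An empty archive counts as fully novel.
    if not archive:
        return 1.0
    return min(1.0, min(abs(theta - a) for a in archive))


def metropolis_with_curiosity(n_steps=2000, step=1.0, temperature=0.3,
                              beta=0.2, seed=0):
    # Random-walk Metropolis over a single policy parameter.
    # Score = return + beta * novelty (an assumed form, not from the paper).
    # Because the archive changes over time, the target is not fixed, so
    # this is an adaptive heuristic rather than an exact sampler.
    rng = random.Random(seed)
    theta = rng.uniform(-2.0, 2.0)
    archive = []
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        current_score = toy_return(theta) + beta * novelty(theta, archive)
        proposal_score = toy_return(proposal) + beta * novelty(proposal, archive)
        log_alpha = (proposal_score - current_score) / temperature
        # Metropolis rule: always accept improvements, otherwise accept
        # with probability exp(log_alpha).
        if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
            theta = proposal
            archive.append(theta)
        samples.append(theta)
    return samples


if __name__ == "__main__":
    draws = metropolis_with_curiosity()
    near_left = sum(1 for s in draws if s < 0.0)
    near_right = len(draws) - near_left
    print("samples near left optimum:", near_left)
    print("samples near right optimum:", near_right)
```

In this toy setting, both optima yield the same return, so a sampler that visits both illustrates the kind of coverage of distinct optimal behaviours that the abstract describes. In a real control task, `theta` would be a full set of policy parameters and `toy_return` would be replaced by rollouts in the environment.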
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI)
