MAME: Model-Agnostic Meta-Exploration
Swaminathan Gurumurthy, Sumit Kumar, Katia Sycara

TL;DR
This paper introduces MAME, a model-agnostic meta-exploration framework that employs separate exploration and exploitation policies to improve adaptation efficiency in meta-reinforcement learning tasks.
Contribution
It proposes explicitly modeling separate exploration and exploitation policies, enhancing training flexibility and adaptation efficiency compared to prior methods.
Findings
Superior performance over prior methods in meta-reinforcement learning tasks
Effective use of self-supervised or supervised objectives for adaptation
More efficient inner-loop updates with a separate exploration policy
Abstract
Meta-reinforcement learning approaches aim to develop learning procedures that can adapt quickly to a distribution of tasks with the help of a few examples. Developing efficient exploration strategies capable of finding the most useful samples becomes critical in such settings. Existing approaches to finding efficient exploration strategies add auxiliary objectives to promote exploration by the pre-update policy. However, this makes adaptation using a few gradient steps difficult, as the pre-update (exploration) and post-update (exploitation) policies are often quite different. Instead, we propose to explicitly model a separate exploration policy for the task distribution. Having two different policies gives more flexibility in training the exploration policy and also makes adaptation to any specific task easier. We show that using self-supervised or supervised learning…
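The separation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the quadratic reward, the Gaussian exploration policy, the linear reward model standing in for the exploitation side, and every function name below are assumptions made for illustration only. The sketch shows just the structural idea: one set of parameters gathers the samples, and a different set is adapted in the inner loop with a supervised loss on those samples.

```python
import numpy as np

def collect_with_exploration_policy(explore_params, task_goal, n_samples, rng):
    """Sample actions from a Gaussian exploration policy (hypothetical) and
    observe a toy reward: negative squared distance to the task's goal."""
    mean, log_std = explore_params
    noise = rng.standard_normal((n_samples, mean.size))
    actions = mean + np.exp(log_std) * noise
    rewards = -np.sum((actions - task_goal) ** 2, axis=1)
    return actions, rewards

def features(actions):
    """Features [1, a, a^2], so the toy reward is linear in the features."""
    ones = np.ones((len(actions), 1))
    return np.concatenate([ones, actions, actions ** 2], axis=1)

def adaptation_loss(exploit_params, actions, rewards):
    """Supervised mean-squared error of the reward prediction (assumed objective)."""
    residual = features(actions) @ exploit_params - rewards
    return float(np.mean(residual ** 2))

def inner_loop_update(exploit_params, actions, rewards, lr):
    """One gradient step on the supervised loss; only exploit_params change."""
    phi = features(actions)
    residual = phi @ exploit_params - rewards
    grad = 2.0 * phi.T @ residual / len(rewards)
    return exploit_params - lr * grad

def run_sketch(seed=0):
    rng = np.random.default_rng(seed)
    # Exploration parameters: (mean, log_std) of a 2-D Gaussian over actions.
    explore_params = (np.zeros(2), np.zeros(2))
    # Exploitation-side parameters: weights of the linear reward model.
    exploit_params = np.zeros(5)
    task_goal = np.array([0.5, -0.3])  # one task drawn from an assumed distribution

    actions, rewards = collect_with_exploration_policy(
        explore_params, task_goal, n_samples=20, rng=rng
    )
    loss_before = adaptation_loss(exploit_params, actions, rewards)
    adapted = inner_loop_update(exploit_params, actions, rewards, lr=0.01)
    loss_after = adaptation_loss(adapted, actions, rewards)
    return loss_before, loss_after

if __name__ == "__main__":
    before, after = run_sketch()
    print(f"supervised loss before adaptation: {before:.4f}")
    print(f"supervised loss after one step:    {after:.4f}")
```

In this sketch the exploration parameters are never touched by the inner-loop step, which is the point of keeping the two policies separate. How the exploration policy itself would be trained in the outer loop is left out, since the abstract excerpt above is truncated before describing it.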
Taxonomy
Topics: Reinforcement Learning in Robotics · Software Testing and Debugging Techniques · Simulation Techniques and Applications
