NoRML: No-Reward Meta Learning

Yuxiang Yang; Ken Caluwaerts; Atil Iscen; Jie Tan; Chelsea Finn

arXiv:1903.01063·cs.LG·March 5, 2019·20 cites

Open Access · 1 Repo

TL;DR

NoRML is a meta-learning approach that lets reinforcement learning agents adapt to new environments using only the observed dynamics, without an explicit reward signal. It outperforms MAML in environments whose dynamics change.

Contribution

NoRML extends MAML for RL so that the fine-tuning step uses observed environment dynamics instead of a reward signal, with a more expressive update step than MAML and support for more targeted exploration.

Findings

01. NoRML outperforms MAML in environments with changing dynamics.
02. The method effectively adapts without explicit reward feedback.
03. Validated on synthetic and benchmark environments.

Abstract

Efficiently adapting to new environments and changes in dynamics is critical for agents to successfully operate in the real world. Reinforcement learning (RL) based approaches typically rely on external reward feedback for adaptation. However, in many scenarios this reward signal might not be readily available for the target task, or the difference between the environments can be implicit and only observable from the dynamics. To this end, we introduce a method that allows for self-adaptation of learned policies: No-Reward Meta Learning (NoRML). NoRML extends Model Agnostic Meta Learning (MAML) for RL and uses observable dynamics of the environment instead of an explicit reward function in MAML's finetune step. Our method has a more expressive update step than MAML, while maintaining MAML's gradient based foundation. Additionally, in order to allow more targeted exploration, we…
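The core idea in the abstract, a MAML-style fine-tune step driven by observed transitions rather than rewards, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper or its repository: the scalar linear-Gaussian policy, the linear form of `learned_advantage`, the parameters `psi` and `alpha`, and the toy dynamics in `collect`. The only point it shows is that the adaptation signal is computed from `(s, a, s_next)` and never from a reward.

```python
import random


def grad_log_pi(theta, s, a, sigma=1.0):
    """Gradient w.r.t. theta of log N(a; theta * s, sigma^2)."""
    return (a - theta * s) * s / sigma ** 2


def learned_advantage(psi, s, a, s_next):
    """Hypothetical learned signal: sees only the transition, never a reward."""
    return psi[0] * s + psi[1] * a + psi[2] * s_next


def no_reward_finetune(theta, psi, alpha, transitions):
    """One policy-gradient-style adaptation step driven by dynamics alone."""
    g = sum(
        learned_advantage(psi, s, a, s_next) * grad_log_pi(theta, s, a)
        for s, a, s_next in transitions
    ) / len(transitions)
    return theta + alpha * g


def collect(theta, dynamics_gain, n=64, seed=0):
    """Roll out the policy in a toy environment: s_next = s + gain * a."""
    rng = random.Random(seed)
    transitions = []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        a = theta * s + rng.gauss(0.0, 1.0)  # sample from the Gaussian policy
        transitions.append((s, a, s + dynamics_gain * a))
    return transitions


if __name__ == "__main__":
    theta, psi, alpha = 0.5, (0.0, 0.0, 1.0), 0.1
    for gain in (1.0, -1.0):  # two tasks that differ only in their dynamics
        adapted = no_reward_finetune(theta, psi, alpha, collect(theta, gain))
        print(f"dynamics gain {gain:+.1f}: adapted theta = {adapted:.4f}")
```

In this sketch the two tasks produce different adapted parameters even though no reward is ever observed, because the transition `s_next` carries the information about which dynamics are in effect. In a full meta-learning setup, `theta`, `psi`, and `alpha` would themselves be trained in an outer loop; that part is omitted here.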

Code & Models

Repositories

google-research/google-research (TensorFlow)

Taxonomy

Topics: Reinforcement Learning in Robotics · Fuel Cells and Related Materials · Data Stream Mining Techniques

Methods: Model-Agnostic Meta-Learning