MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

Kevin Li; Abhishek Gupta; Ashwin Reddy; Vitchyr Pong; Aurick Zhou; Justin Yu; Sergey Levine

arXiv:2107.07184·cs.LG·July 20, 2021·6 cites

TL;DR

This paper introduces MURAL, a meta-learning approach that creates uncertainty-aware classifiers to improve exploration and goal guidance in outcome-driven reinforcement learning, especially in complex navigation and robotic tasks.

Contribution

The paper proposes a novel meta-learning method for computing normalized maximum likelihood classifiers that enhance exploration and reward shaping in reinforcement learning.
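To make the idea of a normalized maximum likelihood classifier concrete, here is a minimal sketch that assumes the conditional NML (CNML) formulation: for a query state, the model is refit once per candidate label with the query appended under that label, and the resulting probabilities are normalized. The 1-D states, the tiny logistic model, and the names `fit_logistic` and `cnml_prob_success` are all hypothetical and chosen for illustration; this is not the paper's implementation and it contains no meta-learning.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, steps=1000, lr=0.5, l2=1e-2):
    # Hypothetical helper: 1-D logistic regression fit by gradient descent.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * (gw / n + l2 * w)
        b -= lr * (gb / n)
    return w, b

def cnml_prob_success(query, xs, ys):
    # For each candidate label, refit with (query, label) added to the data,
    # then record the probability that the refit model gives that label.
    scores = []
    for label in (0, 1):
        w, b = fit_logistic(xs + [query], ys + [label])
        p1 = sigmoid(w * query + b)
        scores.append(p1 if label == 1 else 1.0 - p1)
    # Normalize so the two label scores sum to one.
    return scores[1] / (scores[0] + scores[1])

# Assumed toy data: visited states near 0 (label 0), success examples near 1 (label 1).
xs = [0.0, 0.1, 0.2, 0.3, 0.9, 0.95, 1.0, 1.05]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

low = cnml_prob_success(0.1, xs, ys)   # query close to visited states
mid = cnml_prob_success(0.6, xs, ys)   # query in the unexplored gap
high = cnml_prob_success(1.0, xs, ys)  # query close to success examples
print(low, mid, high)
```

In this sketch, a query in the unexplored gap can be fit well under either label, so its normalized score stays closer to one half than scores for queries near the data, which is the uncertainty-aware behavior the summary refers to. The refit per query and per label is what makes the naive computation expensive, and it is presumably this cost that the meta-learning method is meant to reduce.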

Findings

01 Successfully solves challenging navigation tasks

02 Outperforms prior methods in robotic manipulation

03 Provides effective goal-directed exploration

Abstract

Exploration in reinforcement learning is a challenging problem: in the worst case, the agent must search for high-reward states that could be hidden anywhere in the state space. Can we define a more tractable class of RL problems, where the agent is provided with examples of successful outcomes? In this problem setting, the reward function can be obtained automatically by training a classifier to categorize states as successful or not. If trained properly, such a classifier can provide a well-shaped objective landscape that both promotes progress toward good states and provides a calibrated exploration bonus. In this work, we show that an uncertainty-aware classifier can solve challenging reinforcement learning problems by both encouraging exploration and providing directed guidance towards positive outcomes. We propose a novel mechanism for obtaining these calibrated, uncertainty-aware…
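The classifier-as-reward setup described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: states are 1-D, success examples are labeled 1, states visited by the policy are labeled 0, and the reward is the classifier's predicted success probability. It uses a plain maximum-likelihood logistic classifier, not the calibrated, uncertainty-aware classifier the paper proposes, and every name and data value below is hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, steps=2000, lr=0.5, l2=1e-3):
    # Hypothetical helper: fit p(success | x) = sigmoid(w * x + b) on 1-D inputs.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * (gw / n + l2 * w)
        b -= lr * (gb / n)
    return w, b

# Assumed toy task: the goal sits near x = 1.0.
success_examples = [0.9, 0.95, 1.0, 1.05]  # user-provided successful outcomes
visited_states = [0.0, 0.1, 0.2, 0.3]      # states collected by the current policy

xs = success_examples + visited_states
ys = [1] * len(success_examples) + [0] * len(visited_states)
w, b = fit_logistic(xs, ys)

def reward(state):
    # Reward is the classifier's estimated probability that the state is a success.
    return sigmoid(w * state + b)

print(reward(0.0), reward(0.5), reward(1.0))
```

In an RL loop, the negatives would be refreshed with newly visited states and the classifier retrained as the policy improves. A plain classifier like this one gives no exploration bonus for novel states, which is the gap an uncertainty-aware classifier is intended to fill.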

Videos

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning · slideslive

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)

Methods: Attentive Walk-Aggregating Graph Neural Network