All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL

Kai Arulkumaran; Dylan R. Ashley; Jürgen Schmidhuber; Rupesh K. Srivastava

arXiv:2202.11960·cs.LG·February 25, 2022

TL;DR

This paper shows that Upside Down Reinforcement Learning (UDRL), a purely supervised learning approach previously demonstrated in online RL, can also cover imitation learning, offline RL, goal-conditioned RL, and meta-RL with a single algorithm.

Contribution

The paper demonstrates that UDRL, previously shown only in the traditional online RL setting, also works in imitation learning and offline RL and extends to goal-conditioned RL and meta-RL, with one general agent architecture learning across all of these paradigms.

Findings

01

UDRL works effectively in imitation learning and offline RL.

02

A single UDRL agent can learn across multiple RL paradigms.

03

Because it is based purely on supervised learning, UDRL bypasses bootstrapping, off-policy corrections, and discount factors.

Abstract

Upside down reinforcement learning (UDRL) flips the conventional use of the return in the objective function in RL upside down, by taking returns as input and predicting actions. UDRL is based purely on supervised learning, and bypasses some prominent issues in RL: bootstrapping, off-policy corrections, and discount factors. While previous work with UDRL demonstrated it in a traditional online RL setting, here we show that this single algorithm can also work in the imitation learning and offline RL settings, be extended to the goal-conditioned RL setting, and even the meta-RL setting. With a general agent architecture, a single UDRL agent can learn across all paradigms.
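
To make "taking returns as input and predicting actions" concrete, the following is a minimal sketch of how one recorded episode could be turned into supervised training examples. It is not taken from the paper or its repository: the function name `relabel_episode`, the tuple layout, and the use of the remaining horizon as a second command input are assumptions made for illustration. Returns are summed without discounting, in line with the abstract's remark that UDRL bypasses discount factors.

```python
from typing import List, Sequence, Tuple

State = Tuple[float, ...]
# One supervised example: inputs (state, return_to_go, horizon), target action.
Example = Tuple[State, float, int, int]


def relabel_episode(
    states: Sequence[State],
    actions: Sequence[int],
    rewards: Sequence[float],
) -> List[Example]:
    """Hypothetical helper: build (state, return-to-go, horizon) -> action pairs.

    For each timestep t, the return actually obtained from t to the end of the
    episode is treated as the "command" that the recorded action fulfilled.
    """
    if not (len(states) == len(actions) == len(rewards)):
        raise ValueError("states, actions and rewards must have equal length")

    examples: List[Example] = []
    return_to_go = 0.0
    # Walk backwards so the undiscounted return-to-go accumulates in one pass.
    for t in range(len(rewards) - 1, -1, -1):
        return_to_go += rewards[t]
        horizon = len(rewards) - t  # number of steps left, including step t
        examples.append((states[t], return_to_go, horizon, actions[t]))
    examples.reverse()  # restore chronological order
    return examples


# Toy episode with made-up values, purely for illustration.
episode_states = [(0.0,), (1.0,), (2.0,)]
episode_actions = [1, 0, 1]
episode_rewards = [1.0, 0.0, 2.0]

dataset = relabel_episode(episode_states, episode_actions, episode_rewards)
for state, ret, horizon, action in dataset:
    print(f"input=(state={state}, return={ret}, horizon={horizon}) -> target action={action}")
```

A classifier trained on such pairs with an ordinary supervised loss needs no value function, which is why bootstrapping and off-policy corrections do not arise in this sketch. At test time the return input would be set to the outcome one wants the agent to achieve.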

Code & Models

Repositories

kaixhin/gudrl (PyTorch, official)

Taxonomy

Topics: Auction Theory and Applications · Reinforcement Learning in Robotics