Transfer RL via the Undo Maps Formalism

Abhi Gupta; Ted Moskovitz; David Alvarez-Melis; Aldo Pacchiano

arXiv:2211.14469 · cs.LG · November 29, 2022

TL;DR

This paper introduces TvD, a novel transfer reinforcement learning framework that uses distribution matching and optimal transport to adapt policies across environments with different state spaces, without modifying the original policies.

Contribution

The paper proposes a data-centric transfer method using optimal transport to learn environment transformations, enabling policy transfer without domain-specific assumptions.
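
To make the idea of learning an environment transformation by distribution matching concrete, here is a minimal sketch under strong simplifying assumptions. It is not the paper's TvD implementation. The gridworld size, the hypothetical trajectory, the restriction of candidate "undo maps" to the four grid rotations, and all function names are illustrative assumptions. The optimal transport cost is computed exactly by brute force over permutations, which is valid only for two equal-size, uniformly weighted point sets and only practical for a handful of points.

```python
from itertools import permutations

N = 4  # side length of a hypothetical N x N gridworld (assumption)


def rotate(state, k=1):
    """Rotate a (row, col) cell clockwise by 90 degrees, k times."""
    r, c = state
    for _ in range(k % 4):
        r, c = c, N - 1 - r
    return (r, c)


def sq_dist(p, q):
    """Squared Euclidean distance between two grid cells."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def ot_cost(xs, ys):
    """Exact OT cost between two equal-size, uniformly weighted point sets.

    With uniform weights on equally many points, some optimal coupling is a
    permutation, so searching all permutations gives the exact cost.
    """
    assert len(xs) == len(ys)
    best = min(
        sum(sq_dist(x, ys[j]) for x, j in zip(xs, perm))
        for perm in permutations(range(len(ys)))
    )
    return best / len(xs)


# States visited by a source policy (hypothetical trajectory).
source_states = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 3), (2, 3)]

# The target environment shows the same behaviour under a drift in the state
# space; in this toy setup the drift is one clockwise quarter-turn of the grid.
target_states = [rotate(s, 1) for s in source_states]

# Candidate undo maps: rotate target states by k quarter-turns, and keep the
# candidate whose mapped states best match the source state distribution.
costs = {
    k: ot_cost([rotate(s, k) for s in target_states], source_states)
    for k in range(4)
}
best_k = min(costs, key=costs.get)


def source_policy(state):
    """Toy source policy: move right until the last column, then down."""
    _, c = state
    return "right" if c < N - 1 else "down"


def transferred_policy(target_state):
    """Apply the unmodified source policy to the 'undone' target state."""
    return source_policy(rotate(target_state, best_k))
```

In this toy setup the selected map is the rotation that inverts the drift, and the source policy is reused unchanged by composing it with that map. A realistic setting would need a richer family of maps and a scalable OT solver in place of the permutation search.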

Findings

01 Successful transfer across environment transformations in gridworlds

02 Effective distribution matching via optimal transport

03 Policy transfer without modifying original policies

Abstract

Transferring knowledge across domains is one of the most fundamental problems in machine learning, but doing so effectively in the context of reinforcement learning remains largely an open problem. Current methods make strong assumptions on the specifics of the task, often lack principled objectives, and -- crucially -- modify individual policies, which might be sub-optimal when the domains differ due to a drift in the state space, i.e., it is intrinsic to the environment and therefore affects every agent interacting with it. To address these drawbacks, we propose TvD: transfer via distribution matching, a framework to transfer knowledge across interactive domains. We approach the problem from a data-centric perspective, characterizing the discrepancy in environments by means of (potentially complex) transformation between their state spaces, and thus posing the problem of transfer as…

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Topic Modeling