Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion

Josh Roy; George Konidaris

arXiv:2006.03465 · cs.LG · June 8, 2020

TL;DR

This paper presents WAPPO, a reinforcement learning algorithm that minimizes the Wasserstein-1 distance between the feature distributions of a source and a target task to enable visual transfer, outperforming the prior state of the art in visual transfer.

Contribution

WAPPO introduces a Wasserstein-based adversarial approach for visual transfer in reinforcement learning, explicitly aligning feature distributions between source and target tasks.

Findings

01 WAPPO outperforms the prior state of the art in visual transfer.

02 WAPPO successfully transfers policies across Visual Cartpole and two instantiations of 16 OpenAI Procgen environments.

03 Explicitly aligning the feature distributions of the source and target tasks improves transfer performance.

Abstract

We introduce Wasserstein Adversarial Proximal Policy Optimization (WAPPO), a novel algorithm for visual transfer in Reinforcement Learning that explicitly learns to align the distributions of extracted features between a source and target task. WAPPO approximates and minimizes the Wasserstein-1 distance between the distributions of features from source and target domains via a novel Wasserstein Confusion objective. WAPPO outperforms the prior state-of-the-art in visual transfer and successfully transfers policies across Visual Cartpole and two instantiations of 16 OpenAI Procgen environments.
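For intuition about the quantity being minimized, the following is a minimal sketch of the Wasserstein-1 distance between two empirical one-dimensional samples. It is not WAPPO's Wasserstein Confusion objective, which the abstract describes as an approximation over learned features; it relies only on the standard fact that, in one dimension with equal sample sizes, the optimal transport plan pairs the sorted samples. The function name `wasserstein1_1d` and the toy "source", "target", and "aligned" feature samples are hypothetical and chosen purely for illustration.

```python
import random


def wasserstein1_1d(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples.

    Illustrative helper only (hypothetical name, not taken from the paper).
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension, the optimal coupling matches the i-th smallest
    # value of xs with the i-th smallest value of ys.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Toy stand-ins for a single feature extracted in each domain (assumed data).
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
target = [rng.gauss(1.5, 1.0) for _ in range(2000)]

# Pretend an alignment step has removed the shift between the domains.
aligned = [t - 1.5 for t in target]

d_before = wasserstein1_1d(source, target)
d_after = wasserstein1_1d(source, aligned)
print("W1 before alignment:", d_before)
print("W1 after alignment:", d_after)
```

In this toy setting the distance shrinks once the shift between the two samples is removed, which is the sense in which a smaller Wasserstein-1 distance indicates better-aligned feature distributions.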

Code & Models

Repositories

ku2482/wappo.pytorch
pytorch

Videos

Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Cell Image Analysis Techniques