Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Beining Han; Chongyi Zheng; Harris Chan; Keiran Paster; Michael R. Zhang; Jimmy Ba

arXiv:2110.14248·cs.LG·October 29, 2021

TL;DR

This paper addresses domain generalization in goal-conditioned deep RL. It proposes a theoretical framework and a new method, PA-SkewFit, that improves robustness to environmental changes, with a reported 50% improvement over baselines.

Contribution

The paper introduces a theoretical framework for goal-conditioned RL in Block MDPs and proposes PA-SkewFit, a practical method to enhance domain invariance and generalization.

Findings

1. PA-SkewFit improves performance by 50% over baselines.
2. The framework characterizes policy generalizability in new environments.
3. Empirical results demonstrate robustness in unseen test environments.

Abstract

Deep Reinforcement Learning (RL) is successful in solving many complex Markov Decision Process (MDP) problems. However, agents often face unanticipated environmental changes after deployment in the real world. These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual input agents. Unfortunately, deep RL policies are usually sensitive to these changes and fail to act robustly against them. This resembles the problem of domain generalization in supervised learning. In this work, we study this problem for goal-conditioned RL agents. We propose a theoretical framework in the Block MDP setting that characterizes the generalizability of goal-conditioned policies to new environments. Under this framework, we develop a practical method, PA-SkewFit, that enhances domain generalization. The empirical evaluation shows that our goal-conditioned…

Code & Models

Repositories

facebookresearch/icp-block-mdp
tf · Official

Videos

Learning Domain Invariant Representations in Goal-conditioned Block MDPs · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics

Methods: Test