Learning Domain Invariant Representations in Goal-conditioned Block MDPs
Beining Han, Chongyi Zheng, Harris Chan, Keiran Paster, Michael R. Zhang, Jimmy Ba

TL;DR
This paper addresses domain generalization in goal-conditioned deep RL. It proposes a theoretical framework and a new method, PA-SkewFit, that improves robustness to environmental changes, with a reported 50% improvement over baselines.
Contribution
It introduces a theoretical framework for goal-conditioned RL in Block MDPs and proposes PA-SkewFit, a practical method to enhance domain invariance and generalization.
Findings
PA-SkewFit improves performance by 50% over baselines.
The framework characterizes policy generalizability in new environments.
Empirical results demonstrate robustness in unseen test environments.
Abstract
Deep Reinforcement Learning (RL) is successful in solving many complex Markov Decision Process (MDP) problems. However, agents often face unanticipated environmental changes after deployment in the real world. These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual-input agents. Unfortunately, deep RL policies are usually sensitive to these changes and fail to act robustly against them. This resembles the problem of domain generalization in supervised learning. In this work, we study this problem for goal-conditioned RL agents. We propose a theoretical framework in the Block MDP setting that characterizes the generalizability of goal-conditioned policies to new environments. Under this framework, we develop a practical method, PA-SkewFit, that enhances domain generalization. The empirical evaluation shows that our goal-conditioned…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics
Methods: Test
