A Geometric Perspective on Self-Supervised Policy Adaptation

Cristian Bodnar; Karol Hausman; Gabriel Dulac-Arnold; Rico Jonschkowski

arXiv:2011.07318 · cs.LG · November 17, 2020

TL;DR

This paper introduces a geometric framework for long-term self-supervised policy adaptation in reinforcement learning, targeting robustness to real-world distractions and better generalization by manipulating the geometry of the actor and critic manifolds.

Contribution

It presents a novel geometric perspective on self-supervised RL adaptation, analyzing embedding space dynamics and proposing methods to enhance long-term policy robustness.

Findings

01. Some of the processes that take place in the embedding space during adaptation hurt performance, and these effects can be mitigated.

02. Manipulating the geometry of the actor and critic manifolds improves generalization.

03. The paper offers theoretical insights into how actor-based and actor-free agents adapt.

Abstract

One of the most challenging aspects of real-world reinforcement learning (RL) is the multitude of unpredictable and ever-changing distractions that could divert an agent from what it was tasked to do in its training environment. While an agent could learn from reward signals to ignore them, the complexity of the real world can make rewards hard to acquire, or, at best, extremely sparse. A recent class of self-supervised methods has shown promise that reward-free adaptation under challenging distractions is possible. However, previous work focused on a short one-episode adaptation setting. In this paper, we consider a long-term adaptation setup that is more akin to the specifics of the real world and propose a geometric perspective on self-supervised adaptation. We empirically describe the processes that take place in the embedding space during this adaptation process, reveal some of its…

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Causal Inference Techniques · Economic Policies and Impacts