Accounting for the Sequential Nature of States to Learn Features for Reinforcement Learning

Nathan Michlo; Devon Jarvis; Richard Klein; Steven James

arXiv:2205.06000 · cs.LG · May 13, 2022

Open Access

TL;DR

This paper addresses the challenge of learning useful state representations in reinforcement learning environments with non-overlapping states by leveraging the sequential nature of states to improve variational autoencoder performance.

Contribution

The authors introduce a method that uses the sequential order of states to enable VAEs with triplet loss to learn effective features without extra supervision.

Findings

01 VAEs fail to learn useful features in environments where states do not significantly overlap.

02 The sequential order of states in a replay buffer can be used to approximate a distance metric.

03 A VAE modified with triplet loss learns features useful for downstream tasks in these environments.

Abstract

In this work, we investigate the properties of data that cause popular representation learning approaches to fail. In particular, we find that in environments where states do not significantly overlap, variational autoencoders (VAEs) fail to learn useful features. We demonstrate this failure in a simple gridworld domain, and then provide a solution in the form of metric learning. However, metric learning requires supervision in the form of a distance function, which is absent in reinforcement learning. To overcome this, we leverage the sequential nature of states in a replay buffer to approximate a distance metric and provide a weak supervision signal, under the assumption that temporally close states are also semantically similar. We modify a VAE with triplet loss and demonstrate that this approach is able to learn useful features for downstream tasks, without additional supervision,…
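
The weak supervision signal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `near` and `far` index thresholds, the Euclidean distance, and the margin value are all assumptions made for illustration. The sketch treats states that are close in an ordered trajectory as positives and states that are far apart as negatives, then applies a standard triplet loss to their embeddings.

```python
import numpy as np


def sample_triplet_indices(num_states, rng, near=2, far=5):
    """Pick (anchor, positive, negative) indices from one ordered trajectory.

    Assumption (not from the paper): a positive lies within `near` steps of
    the anchor, and a negative lies at least `far` steps away.
    """
    if near < 1 or far <= near:
        raise ValueError("need 1 <= near < far")
    if num_states < 2 * far + 1:
        # Guarantees every anchor has at least one state `far` steps away.
        raise ValueError("trajectory too short for the chosen `far` gap")
    anchor = int(rng.integers(num_states))
    idx = np.arange(num_states)
    gap = np.abs(idx - anchor)
    positive = int(rng.choice(idx[(gap >= 1) & (gap <= near)]))
    negative = int(rng.choice(idx[gap >= far]))
    return anchor, positive, negative


def triplet_loss(z_anchor, z_positive, z_negative, margin=1.0):
    """Standard triplet loss on embeddings, averaged over the batch.

    Inputs have shape (batch, dim) or (dim,). The Euclidean distance and
    the margin value are illustrative choices.
    """
    z_anchor, z_positive, z_negative = (
        np.atleast_2d(np.asarray(z, dtype=float))
        for z in (z_anchor, z_positive, z_negative)
    )
    d_pos = np.linalg.norm(z_anchor - z_positive, axis=1)
    d_neg = np.linalg.norm(z_anchor - z_negative, axis=1)
    # Penalise triplets where the positive is not closer than the
    # negative by at least `margin`.
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical embeddings of a 20-step trajectory (random stand-ins
    # for the latent means an encoder would produce).
    embeddings = rng.normal(size=(20, 4))
    a, p, n = sample_triplet_indices(len(embeddings), rng)
    print(a, p, n, triplet_loss(embeddings[a], embeddings[p], embeddings[n]))
```

In a VAE setting, a term like this would typically be added to the reconstruction and KL terms and computed on the encoder's latent codes; how the terms are weighted is a design choice not specified here.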

Taxonomy

Topics: Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis · Reinforcement Learning in Robotics

Methods: Triplet Loss