Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings
John D. Co-Reyes, YuXuan Liu, Abhishek Gupta, Benjamin Eysenbach, Pieter Abbeel, Sergey Levine

TL;DR
This paper introduces SeCTAR, a hierarchical reinforcement learning model that learns trajectory embeddings through a variational autoencoder framework, enabling effective long-horizon planning and exploration in complex tasks.
Contribution
The paper presents a novel trajectory autoencoder that learns consistent latent-conditioned policies and models, facilitating hierarchical RL with improved planning and exploration capabilities.
Findings
Outperforms standard RL methods on long-horizon tasks
Effective in sparse reward environments
Enables hierarchical reasoning and model-based planning
Abstract
In this work, we take a representation learning perspective on hierarchical reinforcement learning, where the problem of learning lower layers in a hierarchy is transformed into the problem of learning trajectory-level generative models. We show that we can learn continuous latent representations of trajectories, which are effective in solving temporally extended and multi-stage problems. Our proposed model, SeCTAR, draws inspiration from variational autoencoders, and learns latent representations of trajectories. A key component of this method is to learn both a latent-conditioned policy and a latent-conditioned model which are consistent with each other. Given the same latent, the policy generates a trajectory which should match the trajectory predicted by the model. This model provides a built-in prediction mechanism, by predicting the outcome of closed loop policy behavior. We…
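The consistency idea in the abstract (given the same latent, the policy's rollout should match the model's predicted trajectory) can be illustrated with a toy sketch. This is not the authors' implementation: the linear maps `W_model` and `W_policy`, the additive toy dynamics, the dimensions, and the mean-squared-error gap are all assumptions chosen only to make the idea concrete.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, LATENT_DIM, HORIZON = 2, 3, 5  # illustrative sizes, not from the paper

# Hypothetical linear stand-ins for the learned networks (assumption).
W_model = rng.normal(size=(LATENT_DIM, HORIZON * STATE_DIM)) * 0.1
W_policy = rng.normal(size=(STATE_DIM + LATENT_DIM, STATE_DIM)) * 0.1


def model_decode(z):
    """Latent-conditioned model: predict a full state trajectory from z."""
    return (z @ W_model).reshape(HORIZON, STATE_DIM)


def policy_action(state, z):
    """Latent-conditioned policy: pick an action from the current state and z."""
    return np.concatenate([state, z]) @ W_policy


def rollout(z, start_state):
    """Run the policy closed loop in a toy environment (next = state + action)."""
    state, states = start_state, []
    for _ in range(HORIZON):
        state = state + policy_action(state, z)
        states.append(state)
    return np.stack(states)


def consistency_gap(z, start_state):
    """Mean squared difference between the policy rollout and the model prediction."""
    return float(np.mean((rollout(z, start_state) - model_decode(z)) ** 2))


z = rng.normal(size=LATENT_DIM)
gap = consistency_gap(z, np.zeros(STATE_DIM))
print(f"consistency gap for one sampled latent: {gap:.4f}")
```

Under these assumptions, training would push `consistency_gap` toward zero for sampled latents, so that the model's prediction can be used to anticipate what the closed-loop policy will do for a given latent.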
Taxonomy
Topics: Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety · Anomaly Detection Techniques and Applications
