Simplified Temporal Consistency Reinforcement Learning
Yi Zhao, Wenshuai Zhao, Rinu Boney, Juho Kannala, Joni Pajarinen

TL;DR
This paper demonstrates that a simple latent dynamics model trained with temporal consistency can significantly improve reinforcement learning efficiency, matching or surpassing more complex methods in high-dimensional tasks.
Contribution
Introduces a simple representation learning approach based on latent temporal consistency that improves RL performance without complex auxiliary objectives.
Findings
Learns an accurate dynamics model for challenging high-dimensional locomotion tasks.
Trains 4.1 times faster than ensemble methods.
Outperforms other model-free methods and matches model-based sample efficiency.
Abstract
Reinforcement learning can solve complex sequential decision-making tasks but is currently limited by sample efficiency and required computation. To improve sample efficiency, recent work focuses on model-based RL, which interleaves model learning with planning. Recent methods further utilize policy learning, value estimation, and self-supervised learning as auxiliary objectives. In this paper we show that, surprisingly, a simple representation learning approach relying only on a latent dynamics model trained by latent temporal consistency is sufficient for high-performance RL. This applies when using pure planning with a dynamics model conditioned on the representation, but also when utilizing the representation as policy and value function features in model-free RL. In experiments, our approach learns an accurate dynamics model to solve challenging high-dimensional locomotion…
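To make the idea of latent temporal consistency concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear encoder, a separate target encoder, a linear latent dynamics model, and a discounted cosine-similarity objective; none of these details are given in the abstract. All names (`enc_W`, `tgt_W`, `dyn_W`, `rho`, `consistency_loss`) are hypothetical. The dynamics model is rolled forward in latent space from the first observation, and each predicted latent is compared with the target encoding of the observation that actually followed.

```python
import math

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def cosine(a, b, eps=1e-8):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def consistency_loss(enc_W, tgt_W, dyn_W, observations, actions, rho=0.9):
    """Multi-step latent consistency loss (assumed form, for illustration).

    observations: list of H + 1 observation vectors
    actions:      list of H action vectors
    rho:          assumed per-step discount on later prediction errors
    """
    # Encode only the first observation; later latents come from the model.
    z = matvec(enc_W, observations[0])
    loss = 0.0
    for t, action in enumerate(actions):
        # Predict the next latent from the concatenated [latent; action].
        z = matvec(dyn_W, z + action)
        # Target latent from the (assumed) separate target encoder.
        target = matvec(tgt_W, observations[t + 1])
        loss += (rho ** t) * (1.0 - cosine(z, target))
    return loss

# Toy example: 2-D observations, 2-D latents, 1-D actions.
identity = [[1.0, 0.0], [0.0, 1.0]]
keep_latent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # ignores the action
static_obs = [[1.0, 2.0]] * 3
toy_actions = [[0.5], [0.5]]
toy_loss = consistency_loss(identity, identity, keep_latent, static_obs, toy_actions)
```

In this toy setup the observations never change and the dynamics model copies the latent forward, so the predicted and target latents coincide and the loss is close to zero. In a real agent the encoder and dynamics model would be neural networks trained by gradient descent on such a loss, and the resulting latent could then serve either as the planning state or as features for a policy and value function, as the abstract describes.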
Taxonomy
Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)
