Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
Paul Mattes, Rainer Schlosser, Ralf Herbrich

TL;DR
Hieros introduces a hierarchical world model built on S5 layers for efficient, multi-scale imagination in reinforcement learning, improving sample efficiency, exploration, and prediction accuracy on the Atari 100k benchmark.
Contribution
This paper presents Hieros, a hierarchical policy paired with an S5-based world model, which improves imagination accuracy and efficiency relative to existing RNN- and Transformer-based world models.
Findings
Outperforms the state of the art on the Atari 100k benchmark
Predicts complex dynamics with high accuracy
Displays superior exploration capabilities
Abstract
One of the biggest challenges to modern deep reinforcement learning (DRL) algorithms is sample efficiency. Many approaches learn a world model in order to train an agent entirely in imagination, eliminating the need for direct environment interaction during training. However, these methods often suffer from either a lack of imagination accuracy, exploration capabilities, or runtime efficiency. We propose Hieros, a hierarchical policy that learns time abstracted world representations and imagines trajectories at multiple time scales in latent space. Hieros uses an S5 layer-based world model, which predicts next world states in parallel during training and iteratively during environment interaction. Due to the special properties of S5 layers, our method can train in parallel and predict next world states iteratively during imagination. This allows for more efficient training than…
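The abstract's claim that the same world model can be evaluated in parallel during training and step by step during interaction follows from the linear recurrence at the core of state space layers. The sketch below illustrates that idea on a toy scalar recurrence x_k = a * x_{k-1} + b * u_k. It is not the authors' implementation: the function names, the scalar real-valued parameters, and the zero initial state are assumptions made for illustration, and a real S5 layer works with multi-dimensional states, discretization, and nonlinearities that are omitted here.

```python
def combine(left, right):
    """Compose two affine maps x -> a*x + b, applying `left` first."""
    a1, b1 = left
    a2, b2 = right
    return (a2 * a1, a2 * b1 + b2)


def step_recurrent(a, b, inputs, x0=0.0):
    """Iterative mode: one state update per time step."""
    states, x = [], x0
    for u in inputs:
        x = a * x + b * u
        states.append(x)
    return states


def scan_parallel(a, b, inputs):
    """Parallel mode: inclusive (Hillis-Steele) scan over affine maps.

    Each of the O(log T) rounds updates every position independently,
    so a round could be executed concurrently on parallel hardware.
    Assumes a zero initial state.
    """
    elems = [(a, b * u) for u in inputs]
    offset = 1
    while offset < len(elems):
        elems = [
            combine(elems[i - offset], elems[i]) if i >= offset else elems[i]
            for i in range(len(elems))
        ]
        offset *= 2
    # With x0 = 0, the state at step k is the offset term of the composed map.
    return [b_k for _, b_k in elems]
```

Calling `step_recurrent(0.5, 1.0, [1.0, 2.0, 3.0, 4.0])` and `scan_parallel(0.5, 1.0, [1.0, 2.0, 3.0, 4.0])` yields the same state sequence. The scan works because composing affine maps is associative, which is the property that lets training process a whole sequence at once while imagination and environment interaction reuse the cheap one-step update.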
Taxonomy
Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification
