The Effectiveness of World Models for Continual Reinforcement Learning
Samuel Kessler, Mateusz Ostaszewski, Michał Bortkiewicz, Mateusz Żarski, Maciej Wołczyk, Jack Parker-Holder, Stephen J. Roberts, Piotr Miłoś

TL;DR
This paper demonstrates that world models can be effectively adapted for continual reinforcement learning, improving performance and transfer in changing environments through selective experience replay and new modeling strategies.
Contribution
It introduces Continual-Dreamer, a task-agnostic world model approach for continual RL that outperforms state-of-the-art task-agnostic methods on the Minigrid and Minihack benchmarks.
Findings
Continual-Dreamer is sample efficient.
It outperforms state-of-the-art task-agnostic continual RL methods on Minigrid and Minihack.
Selective experience replay enhances performance and transfer.
Abstract
World models power some of the most efficient reinforcement learning algorithms. In this work, we show that they can be harnessed for continual learning, a setting in which the agent faces changing environments. World models typically employ a replay buffer for training, which extends naturally to continual learning. We systematically study how different selective experience replay methods affect performance, forgetting, and transfer. We also provide recommendations regarding various modeling options for using world models. We call the best set of choices Continual-Dreamer; it is task-agnostic and uses the world model for continual exploration. Continual-Dreamer is sample efficient and outperforms state-of-the-art task-agnostic continual reinforcement learning methods on the Minigrid and Minihack benchmarks.
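To make "selective experience replay" concrete, here is a minimal sketch of one common strategy, reservoir sampling, which keeps a fixed-size buffer that remains a uniform sample of all experience seen so far, so data from earlier tasks is not simply overwritten by the newest task. The abstract does not say which selection methods were studied, so treat this as an illustrative assumption rather than the authors' implementation; the class name ReservoirReplayBuffer and the (task_id, step) items are hypothetical stand-ins for real transitions or episodes.

```python
import random
from typing import Any, List, Optional


class ReservoirReplayBuffer:
    """Hypothetical fixed-capacity buffer holding a uniform sample of all items seen."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items: List[Any] = []
        self.num_seen = 0  # total number of items ever offered to the buffer
        self._rng = random.Random(seed)

    def add(self, item: Any) -> None:
        """Offer one item; once full, keep it with probability capacity / num_seen."""
        self.num_seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
            return
        # Draw a slot uniformly from [0, num_seen); only slots inside the
        # buffer cause a replacement, which gives the reservoir guarantee.
        slot = self._rng.randrange(self.num_seen)
        if slot < self.capacity:
            self.items[slot] = item

    def sample(self, batch_size: int) -> List[Any]:
        """Return up to batch_size stored items, drawn without replacement."""
        k = min(batch_size, len(self.items))
        return self._rng.sample(self.items, k)


# Usage: stream three consecutive "tasks" through a small buffer.
buffer = ReservoirReplayBuffer(capacity=100, seed=0)
for task_id in range(3):
    for step in range(1000):
        buffer.add((task_id, step))  # stand-in for a transition or episode

batch = buffer.sample(32)
```

Unlike a first-in-first-out buffer, which would hold only the most recent task once it fills, this buffer needs no task labels or task boundaries to retain older experience, which is what makes it a natural fit for a task-agnostic setting.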
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Advanced Bandit Algorithms Research
Methods: Experience Replay
