TL;DR
Simulus is a modular, token-based world model agent that combines several known improvements and achieves state-of-the-art sample efficiency among planning-free world models across diverse reinforcement learning benchmarks.
Contribution
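Introduces a modular token-based world model agent that integrates a flexible tokenization framework, intrinsic motivation, prioritized world model replay, and regression-as-classification, and demonstrates that these improvements are effective in combination.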
Findings
Achieves state-of-the-art sample efficiency for planning-free world models on the Atari, DMC Proprioception, and Craftax benchmarks.
Intrinsic motivation benefits sample efficiency even with limited interactions.
Each component individually contributes to overall performance, with synergistic effects when combined.
Abstract
World models (WMs) represent the frontier of sample-efficient reinforcement learning, but their complexity leaves many promising improvements unrealized due to the significant expertise and effort required to identify and integrate them. Inspired by Rainbow, which showed that individually known improvements to DQN complement each other and can be effectively combined, we take on this challenge and ask whether the same principle applies to world model agents. We introduce Simulus, a modular token-based WM agent that integrates: (1) a flexible tokenization framework supporting arbitrary combinations of observation and action modalities; (2) intrinsic motivation for epistemic uncertainty reduction; (3) prioritized world model replay; and (4) regression-as-classification for reward and return prediction. Simulus achieves state-of-the-art sample efficiency for planning-free WMs across three…
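As an illustration of item (4), regression-as-classification is commonly implemented by spreading a scalar target over the two nearest entries of a fixed grid of bins ("two-hot" encoding) and training with cross-entropy. The sketch below is a generic NumPy version of that idea, not the Simulus implementation: the bin range, the bin count, and the function names `two_hot`, `decode`, and `cross_entropy` are all assumptions made for this example.

```python
import numpy as np

def two_hot(value, bins):
    """Encode a scalar as weights on the two nearest bin centers (illustrative)."""
    value = float(np.clip(value, bins[0], bins[-1]))
    # Index of the right-hand neighbor, clipped so that hi - 1 and hi are valid.
    hi = int(np.clip(np.searchsorted(bins, value, side="right"), 1, len(bins) - 1))
    lo = hi - 1
    w_hi = (value - bins[lo]) / (bins[hi] - bins[lo])
    target = np.zeros(len(bins))
    target[lo] = 1.0 - w_hi
    target[hi] = w_hi
    return target

def decode(probs, bins):
    """Recover a scalar prediction as the expected bin center."""
    return float(np.dot(probs, bins))

def cross_entropy(logits, target):
    """Cross-entropy between predicted logits and a soft two-hot target."""
    logits = logits - logits.max()  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return float(-(target * log_probs).sum())

# Hypothetical bin grid: 9 evenly spaced centers in [-2, 2].
bins = np.linspace(-2.0, 2.0, 9)
target = two_hot(0.7, bins)
loss = cross_entropy(np.zeros(len(bins)), target)  # loss against uniform logits
```

Decoding the two-hot target with `decode` returns the original scalar whenever it lies inside the bin range, which is what lets a classification head stand in for a regression head.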
