
TL;DR
This paper extends the Dreamer world model with probabilistic methods that explore many latent states in parallel and maintain distinct hypotheses for mutually exclusive futures. On the MPE SimpleTag domain, the method improves score by 4.5% and lowers the variance of episode returns by 28% relative to standard Dreamer.
Contribution
The work extends the Dreamer model with probabilistic techniques that explore many latent states in parallel and keep separate hypotheses for mutually exclusive futures, while retaining the desirable gradient properties of continuous latents.
Findings
Outperforms standard Dreamer on the MPE SimpleTag domain with a 4.5% score improvement
Achieves 28% lower variance in episode returns than standard Dreamer
Demonstrates benefits of probabilistic approaches in world models
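The summary does not say how the two percentages above are computed. One common reading, shown here as a minimal sketch, is a relative change in mean episode return and a relative reduction in the variance of episode returns. The function names and the episode returns below are made up for illustration; they are not taken from the paper.

```python
import numpy as np


def relative_score_improvement(baseline_returns, method_returns):
    """Relative change in mean episode return versus the baseline.

    Dividing by the absolute baseline mean keeps the sign meaningful
    when returns are negative (an assumption, not the paper's definition).
    """
    base = np.mean(baseline_returns)
    return (np.mean(method_returns) - base) / abs(base)


def relative_variance_reduction(baseline_returns, method_returns):
    """Fraction by which the variance of episode returns shrinks."""
    return 1.0 - np.var(method_returns) / np.var(baseline_returns)


# Made-up episode returns for illustration only; NOT data from the paper.
baseline = np.array([8.0, 12.0, 10.0, 14.0, 6.0])
method = np.array([10.0, 12.0, 11.0, 13.0, 9.0])

improvement = relative_score_improvement(baseline, method)
reduction = relative_variance_reduction(baseline, method)
print(f"score improvement: {improvement:.1%}, variance reduction: {reduction:.1%}")
```

Under this reading, a reported "4.5% score improvement" would correspond to `improvement == 0.045` and "28% lower variance" to `reduction == 0.28`.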
Abstract
"Dreaming" enables agents to learn from imagined experiences, enabling more robust and sample-efficient learning of world models. In this work, we consider innovations to the state-of-the-art Dreamer model using probabilistic methods that enable: (1) the parallel exploration of many latent states; and (2) maintaining distinct hypotheses for mutually exclusive futures while retaining the desirable gradient properties of continuous latents. Evaluating on the MPE SimpleTag domain, our method outperforms standard Dreamer with a 4.5% score improvement and 28% lower variance in episode returns. We also discuss limitations and directions for future work, including how optimal hyperparameters (e.g. particle count K) scale with environmental complexity, and methods to capture epistemic uncertainty in world models.
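To illustrate why a set of K latent particles can keep mutually exclusive futures apart, the sketch below rolls out K continuous latent states in parallel through a toy stochastic transition. It is not the paper's architecture: `rollout_particles`, `toy_transition`, the noise scale, and the bistable dynamics are all assumptions chosen so that the latent settles near either +1 or -1.

```python
import numpy as np


def rollout_particles(z0, transition_mean, noise_std, horizon, rng):
    """Propagate K latent particles for `horizon` imagined steps.

    z0: array of shape (K, D), one row per particle.
    Returns an array of shape (horizon + 1, K, D).
    """
    trajectory = [z0]
    z = z0
    for _ in range(horizon):
        # Every particle takes its own noisy step, so hypotheses can diverge.
        z = transition_mean(z) + noise_std * rng.standard_normal(z.shape)
        trajectory.append(z)
    return np.stack(trajectory)


def toy_transition(z):
    """Hypothetical bistable dynamics: each latent drifts toward sign(z)."""
    return z + 0.2 * (np.sign(z) - z)


rng = np.random.default_rng(0)
K, D = 8, 1  # particle count K and latent dimension (illustrative values)
z0 = rng.normal(0.0, 0.1, size=(K, D))

trajectory = rollout_particles(z0, toy_transition, noise_std=0.01, horizon=30, rng=rng)
final = trajectory[-1, :, 0]
print("final particles:", np.round(final, 2))
print("mean of particles:", round(float(final.mean()), 2))
```

Each particle ends close to one of the two outcomes, whereas a single averaged latent could land between them and describe a future that never occurs. How many particles are needed is the kind of question the abstract raises when it mentions scaling K with environmental complexity.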
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare · Gaussian Processes and Bayesian Inference
