TL;DR
This paper introduces a diffusion-based framework that composes specialized memory experts to improve long-term, spatial, and local memory in world models, sidestepping the trade-off between the local fidelity of transformers and the efficient scaling of recurrent and state-space models.
Contribution
The paper proposes a compositional approach that integrates heterogeneous memory experts into diffusion world models through a contrastive product-of-experts formulation.
Findings
Enhanced temporal consistency in benchmarks
Improved recall of past observations
Better navigation performance
Abstract
World models aim to predict plausible futures consistent with past observations, a capability central to planning and decision-making in reinforcement learning. Yet, existing architectures face a fundamental memory trade-off: transformers preserve local detail but are bottlenecked by quadratic attention, while recurrent and state-space models scale more efficiently but compress history at the cost of fidelity. To overcome this trade-off, we suggest decoupling future-past consistency from any single architecture and instead leveraging a set of specialized experts. We introduce a diffusion-based framework that integrates heterogeneous memory models through a contrastive product-of-experts formulation. Our approach instantiates three complementary roles: a short-term memory expert that captures fine local dynamics, a long-term memory expert that stores episodic history in external…
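To make the product-of-experts idea concrete, here is a minimal sketch of one common way to compose several diffusion experts at sampling time. It is not the paper's implementation: the abstract does not give the exact formulation, so the function name `compose_experts`, the per-expert weights, and the reading of "contrastive" as "each expert is measured against a memory-free base prediction" are all assumptions made for illustration. The sketch relies on a standard fact: if the target density is proportional to p_base(x) multiplied by the product of (p_i(x) / p_base(x))^(w_i), its score is the base score plus the weighted sum of the differences between each expert score and the base score. Because a noise prediction is a rescaled negative score, the same linear combination applies to noise predictions.

```python
from typing import List, Sequence


def compose_experts(
    eps_base: Sequence[float],
    eps_experts: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Combine per-expert noise predictions for one denoising step.

    eps_base:    flattened noise prediction from a memory-free base model
                 (assumed reference; hypothetical in this sketch).
    eps_experts: one flattened noise prediction per memory expert,
                 e.g. short-term and long-term (hypothetical inputs).
    weights:     one guidance weight per expert (hypothetical hyperparameters).
    """
    if len(eps_experts) != len(weights):
        raise ValueError("need exactly one weight per expert")

    combined = [float(v) for v in eps_base]
    for eps_i, w_i in zip(eps_experts, weights):
        if len(eps_i) != len(eps_base):
            raise ValueError("expert prediction has the wrong length")
        # Add the weighted contrast between this expert and the base.
        for k in range(len(combined)):
            combined[k] += w_i * (eps_i[k] - eps_base[k])
    return combined


# Toy usage with two experts on a 2-element "image".
base = [0.0, 0.0]
short_term = [1.0, 1.0]
long_term = [2.0, 2.0]
result = compose_experts(base, [short_term, long_term], [0.5, 0.25])
print(result)
```

With every weight set to zero the function returns the base prediction unchanged, and with a single expert at weight 1 it returns that expert's prediction, which is a quick sanity check on the composition rule.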