Composition of Memory Experts for Diffusion World Models

Sebastian Stapf; Pablo Acuaviva Huertos; Aram Davtyan; Paolo Favaro

arXiv:2605.18813 · cs.LG · May 20, 2026

TL;DR

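This paper introduces a diffusion-based framework that combines specialized memory experts to improve long-term, spatial, and local memory in world models. It targets the trade-off between transformers, which preserve local detail but pay quadratic attention cost, and recurrent or state-space models, which scale efficiently but compress history.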

Contribution

The paper proposes a compositional approach that integrates heterogeneous memory experts into diffusion world models through a contrastive product-of-experts formulation.

Findings

01 Enhanced temporal consistency in benchmarks

02 Improved recall of past observations

03 Better navigation performance

Abstract

World models aim to predict plausible futures consistent with past observations, a capability central to planning and decision-making in reinforcement learning. Yet, existing architectures face a fundamental memory trade-off: transformers preserve local detail but are bottlenecked by quadratic attention, while recurrent and state-space models scale more efficiently but compress history at the cost of fidelity. To overcome this trade-off, we suggest decoupling future-past consistency from any single architecture and instead leveraging a set of specialized experts. We introduce a diffusion-based framework that integrates heterogeneous memory models through a contrastive product-of-experts formulation. Our approach instantiates three complementary roles: a short-term memory expert that captures fine local dynamics, a long-term memory expert that stores episodic history in external…
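The abstract does not spell out the contrastive product-of-experts formulation, so the following is only a rough sketch of one common way such a composition is written for diffusion models. It is an assumption, not the authors' method: each expert's noise prediction is contrasted against a shared base prediction, and the weighted differences are summed. The function name `compose_experts`, the weights, and the toy arrays are all hypothetical.

```python
import numpy as np

def compose_experts(eps_base, eps_experts, weights):
    """Combine expert noise predictions contrastively against a base prediction.

    Assumed form (not taken from the paper):
        eps = eps_base + sum_i w_i * (eps_i - eps_base)
    """
    eps_base = np.asarray(eps_base, dtype=float)
    combined = eps_base.copy()
    for eps_i, w_i in zip(eps_experts, weights):
        # Each expert contributes only where it disagrees with the base.
        combined += w_i * (np.asarray(eps_i, dtype=float) - eps_base)
    return combined

# Toy stand-ins for the outputs of a base model and two memory experts.
base = np.zeros(4)
short_term = np.array([1.0, 0.0, 0.0, 0.0])  # hypothetical short-term expert
long_term = np.array([0.0, 2.0, 0.0, 0.0])   # hypothetical long-term expert

composed = compose_experts(base, [short_term, long_term], [1.0, 0.5])
print(composed)
```

In this form, setting a weight to zero removes that expert's influence entirely, which is one reason a contrastive composition is convenient when experts are added or dropped.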
