Dyn-O: Building Structured World Models with Object-Centric Representations

Zizhao Wang, Kaixin Wang, Li Zhao, Peter Stone, and Jiang Bian

arXiv:2507.03298 · cs.LG · July 8, 2025

TL;DR

Dyn-O is an object-centric world model that learns directly from pixels in complex, cluttered environments. It outperforms DreamerV3 in prediction accuracy on Procgen and supports manipulation of individual object features.

Contribution

The paper presents Dyn-O, a structured world model that improves both object-centric representation learning and dynamics modeling in visually complex environments, with the aim of better generalization and object-feature manipulation.

Findings

01 Outperforms DreamerV3 in Procgen prediction tasks

02 Handles complex, cluttered scenes from pixel data

03 Enables detailed manipulation of object features

Abstract

World models aim to capture the dynamics of the environment, enabling agents to predict and plan for future states. In most scenarios of interest, the dynamics are highly centered on interactions among objects within the environment. This motivates the development of world models that operate on object-centric rather than monolithic representations, with the goal of more effectively capturing environment dynamics and enhancing compositional generalization. However, the development of object-centric world models has largely been explored in environments with limited visual complexity (such as basic geometries). It remains underexplored whether such models can generalize to more complex settings with diverse textures and cluttered scenes. In this paper, we fill this gap by introducing Dyn-O, an enhanced structured world model built upon object-centric representations. Compared to prior…

Videos

Dyn-O: Building Structured World Models with Object-Centric Representations · SlidesLive