Learning Interactive World Model for Object-Centric Reinforcement Learning
Fan Feng, Phillip Lippe, Sara Magliacane

TL;DR
This paper presents FIOC-WM, a structured world model that learns object interactions explicitly, leading to more efficient and generalizable policies in object-centric reinforcement learning tasks.
Contribution
The paper introduces FIOC-WM, a framework that learns structured representations of both objects and their interactions within a single world model, directly from pixels, and trains a hierarchical policy on top of the learned model.
Findings
FIOC-WM enhances sample efficiency in policy learning.
FIOC-WM improves generalization across tasks.
Explicit interaction modeling is crucial for robust control.
Abstract
Agents that understand objects and their interactions can learn policies that are more robust and transferable. However, most object-centric RL methods factor state by individual objects while leaving interactions implicit. We introduce the Factored Interactive Object-Centric World Model (FIOC-WM), a unified framework that learns structured representations of both objects and their interactions within a world model. FIOC-WM captures environment dynamics with disentangled and modular representations of object interactions, improving sample efficiency and generalization for policy learning. Concretely, FIOC-WM first learns object-centric latents and an interaction structure directly from pixels, leveraging pre-trained vision encoders. The learned world model then decomposes tasks into composable interaction primitives, and a hierarchical policy is trained on top: a high level selects the…
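Illustrative sketch (not from the paper): the idea of factoring dynamics into per-object terms plus explicit pairwise interaction terms can be shown with a toy NumPy example. Everything here is an assumption made for illustration: the function name factored_step, the random weights w_self and w_pair, the tanh update rule, and the hand-written interaction graph. FIOC-WM learns its latents and interaction structure from pixels, and its actual architecture may differ substantially.

```python
import numpy as np

def factored_step(z, graph, w_self, w_pair):
    """One step of a toy factored dynamics model (hypothetical, for illustration).

    z:      (K, D) array, one latent vector per object.
    graph:  (K, K) 0/1 array; graph[i, j] = 1 means object j affects object i.
    w_self: (D, D) weights for each object's own dynamics.
    w_pair: (2*D, D) weights for pairwise interaction effects.
    """
    k, _ = z.shape
    own = np.tanh(z @ w_self)                        # per-object update, (K, D)
    recv = np.repeat(z[:, None, :], k, axis=1)       # recv[i, j] = z[i]
    send = np.repeat(z[None, :, :], k, axis=0)       # send[i, j] = z[j]
    pair = np.concatenate([recv, send], axis=-1)     # (K, K, 2*D)
    effect = np.tanh(pair @ w_pair)                  # effect of j on i, (K, K, D)
    mask = graph * (1 - np.eye(k))                   # drop self-edges
    agg = (mask[:, :, None] * effect).sum(axis=1)    # sum over active senders
    return z + own + agg

rng = np.random.default_rng(0)
K, D = 3, 4
z = rng.normal(size=(K, D))
w_self = rng.normal(size=(D, D))
w_pair = rng.normal(size=(2 * D, D))

# Hand-specified interaction structure: only object 1 affects object 0.
graph = np.zeros((K, K))
graph[0, 1] = 1

z_next = factored_step(z, graph, w_self, w_pair)
```

With this graph, perturbing object 1 changes the prediction for object 0 but leaves object 2 untouched, which is the kind of modularity an explicit interaction structure is meant to provide.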
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning
