SOLD: Slot Object-Centric Latent Dynamics Models for Relational Manipulation Learning from Pixels
Malte Mosbach, Jan Niklas Ewertz, Angel Villar-Corrales, Sven Behnke

TL;DR
SOLD introduces an object-centric latent dynamics model that enhances interpretability and relational reasoning in model-based RL from pixel inputs, outperforming existing methods in robotic manipulation tasks.
Contribution
The paper presents SOLD, a model-based RL algorithm that learns an object-centric latent dynamics model from pixels without supervision, yielding a structured latent space that supports interpretability and relational reasoning.
Findings
Outperforms DreamerV3 and TD-MPC2 in robotic environments
Provides interpretable structured latent space
Enhances relational reasoning for manipulation tasks
Abstract
Learning a latent dynamics model provides a task-agnostic representation of an agent's understanding of its environment. Leveraging this knowledge for model-based reinforcement learning (RL) holds the potential to improve sample efficiency over model-free methods by learning from imagined rollouts. Furthermore, because the latent space serves as input to behavior models, the informative representations learned by the world model facilitate efficient learning of desired skills. Most existing methods rely on holistic representations of the environment's state. In contrast, humans reason about objects and their interactions, predicting how actions will affect specific parts of their surroundings. Inspired by this, we propose Slot-Attention for Object-centric Latent Dynamics (SOLD), a novel model-based RL algorithm that learns object-centric dynamics models in an unsupervised manner from…
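To make "learning from imagined rollouts" in an object-centric latent space concrete, here is a minimal sketch. It is not the paper's architecture: the attention-based transition `step_slots`, the policy `toy_policy`, and all dimensions are hypothetical stand-ins for components that would be learned in practice. The sketch assumes the state is a set of per-object slot vectors and that each slot is updated by attending over all slots, so that a rollout can be unrolled entirely in latent space without rendering pixels.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def step_slots(slots, action, mix=0.5):
    """Hypothetical latent transition (an assumption, not SOLD's model).

    Each slot attends over all slots (dot-product attention), blends the
    attended context into its own state, and is shifted by the action.
    """
    dim = len(slots[0])
    scale = math.sqrt(dim)
    next_slots = []
    for query in slots:
        weights = softmax([dot(query, key) / scale for key in slots])
        context = [sum(w * s[d] for w, s in zip(weights, slots)) for d in range(dim)]
        next_slots.append(
            [(1 - mix) * query[d] + mix * context[d] + action[d] for d in range(dim)]
        )
    return next_slots


def toy_policy(slots):
    """Hypothetical policy: push against the mean slot vector."""
    dim = len(slots[0])
    mean = [sum(s[d] for s in slots) / len(slots) for d in range(dim)]
    return [-0.1 * m for m in mean]


def imagine_rollout(slots, policy, horizon):
    """Unroll the latent dynamics for `horizon` steps without any observations."""
    trajectory = [slots]
    for _ in range(horizon):
        action = policy(slots)
        slots = step_slots(slots, action)
        trajectory.append(slots)
    return trajectory


rng = random.Random(0)
# Three slots (e.g. one per object) of dimension four; values are arbitrary.
initial_slots = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
trajectory = imagine_rollout(initial_slots, toy_policy, horizon=5)
```

In a model-based RL setup of this kind, the behavior model would be trained on such imagined trajectories rather than on additional environment interaction, which is where the potential sample-efficiency gain comes from.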
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Topic Modeling
