Binding Actions to Objects in World Models
Ondrej Biza, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong, and Thomas Kipf

TL;DR
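This paper introduces two action-attention mechanisms, soft and hard, that bind actions to objects in structured world models. Hard attention improves object separation in a grid-world environment, soft attention improves performance on a robotic manipulation task, and the learned attention weights make the model easier to interpret.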
Contribution
The paper proposes two action-attention mechanisms, soft and hard, for binding actions to objects in object-factored world models, evaluates them on five environments, and shows that the learned attention weights can be used to interpret the model.
Findings
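Hard attention helps contrastively-trained structured world models learn to separate individual objects in an object-based grid-world environment.
Soft attention increases the performance of factored world models trained on a robotic manipulation task.
The learned attention weights focus on the manipulated object, so they can be used to interpret the factored world model.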
Abstract
We study the problem of binding actions to objects in object-factored world models using action-attention mechanisms. We propose two attention mechanisms for binding actions to objects, soft attention and hard attention, which we evaluate in the context of structured world models for five environments. Our experiments show that hard attention helps contrastively-trained structured world models to learn to separate individual objects in an object-based grid-world environment. Further, we show that soft attention increases performance of factored world models trained on a robotic manipulation task. The learned action attention weights can be used to interpret the factored world model as the attention focuses on the manipulated object in the environment.
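To make the difference between the two mechanisms concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. Everything in it is an assumption made for illustration: the dot-product scoring between an action vector and each object slot, the function names `soft_bind` and `hard_bind`, the use of argmax for the hard variant, and the toy slot and action values. A trained model would likely score slots with learned projections instead.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_scores(slots, action):
    # Hypothetical scoring rule (assumption): the action acts as the query and
    # each object slot acts as its own key, with scaled dot-product scores.
    scale = math.sqrt(len(action))
    return [dot(slot, action) / scale for slot in slots]


def soft_bind(slots, action):
    # Soft attention: every slot receives the action, scaled by its weight.
    weights = softmax(attention_scores(slots, action))
    per_slot = [[w * a for a in action] for w in weights]
    return weights, per_slot


def hard_bind(slots, action):
    # Hard attention, simplified here to argmax (assumption): exactly one
    # slot receives the full action and all other slots receive zeros.
    scores = attention_scores(slots, action)
    winner = max(range(len(scores)), key=scores.__getitem__)
    weights = [1.0 if k == winner else 0.0 for k in range(len(slots))]
    per_slot = [[w * a for a in action] for w in weights]
    return weights, per_slot


# Toy example: three object slots and one 2-D action (made-up values).
slots = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
action = [2.0, 0.0]

soft_w, soft_actions = soft_bind(slots, action)
hard_w, hard_actions = hard_bind(slots, action)

# The most strongly attended slot can be read off the weights, which is
# the sense in which attention weights support interpretation.
attended_slot = soft_w.index(max(soft_w))
```

In this toy setup both variants attend most strongly to the first slot. The soft variant spreads a fraction of the action over the remaining slots, while the hard variant assigns the action to a single slot.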
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
