AGWM: Affordance-Grounded World Models for Environments with Compositional Prerequisites
Qinshi Zhang (1), Weipeng Deng (2), Zhihan Jiang (3), Jiaming Qu (4), Qianren Li (5), Weitao Xu (5), Ray LC (5) ((1) University of California, San Diego, (2) University of Hong Kong, (3) Columbia University, (4) Amazon, (5) City University of Hong Kong)

TL;DR
This paper introduces AGWM, a world model that explicitly tracks action prerequisites as a DAG to improve multi-step predictions and generalization in environments with dynamic action affordances.
Contribution
The paper proposes AGWM, an affordance-grounded world model that tracks dynamic action prerequisites in a DAG structure, addressing the tendency of standard world models to ignore action preconditions.
Findings
Lower multi-step prediction error in experiments
Better generalization to novel configurations
Enhanced interpretability of action dependencies
Abstract
In model-based learning, the agent learns behaviors by simulating trajectories based on world model predictions. Standard world models typically learn a stationary transition function that maps states and actions to next states. When an action and an outcome frequently co-occur in training data, the model tends to internalize this correlation as a general causal rule while ignoring action preconditions. In interactive environments, however, agent actions can reshape the future affordance space. At each timestep, an action may become executable only after its prerequisites are met, or non-executable when they are destroyed. We term such events structure-changing events (SC events). As a result, a conventional world model often fails to determine whether a given action is executable in the current state, especially in multi-step predictions. Each imagined step is conditioned on an…
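To make the idea of prerequisites and SC events concrete, here is a minimal Python sketch. It is not the AGWM implementation: it is a hand-written symbolic toy, and every name in it (`AffordanceGraph`, `rollout`, the crafting-style actions and facts) is a hypothetical chosen for illustration. It assumes each action has a fixed set of prerequisite facts and that executing an action can add or remove facts, which in turn enables or disables later actions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class AffordanceGraph:
    """Toy prerequisite tracker (hypothetical, not the paper's learned model)."""
    # Maps each action to the facts that must hold before it is executable.
    prerequisites: Dict[str, Set[str]]
    # Facts that currently hold in the imagined state.
    satisfied: Set[str] = field(default_factory=set)

    def executable(self, action: str) -> bool:
        # An action is executable only if all of its prerequisites hold.
        return self.prerequisites.get(action, set()) <= self.satisfied

    def apply(self, adds: Set[str], removes: Set[str]) -> None:
        # A structure-changing event: facts are created or destroyed,
        # which changes which actions are executable afterwards.
        self.satisfied |= adds
        self.satisfied -= removes


def rollout(graph: AffordanceGraph,
            effects: Dict[str, Tuple[Set[str], Set[str]]],
            actions: List[str]) -> List[bool]:
    """Imagine a sequence of actions, recording whether each was executable."""
    executed = []
    for action in actions:
        ok = graph.executable(action)
        if ok:
            adds, removes = effects[action]
            graph.apply(adds, removes)
        executed.append(ok)
    return executed


# Hypothetical crafting-style example.
graph = AffordanceGraph(prerequisites={
    "collect_wood": set(),
    "place_table": {"wood"},
    "craft_pickaxe": {"wood", "table"},
})
effects = {
    "collect_wood": ({"wood"}, set()),
    "place_table": ({"table"}, set()),
    "craft_pickaxe": ({"pickaxe"}, {"wood"}),  # crafting consumes the wood
}
plan = ["craft_pickaxe", "collect_wood", "place_table",
        "craft_pickaxe", "craft_pickaxe"]
result = rollout(graph, effects, plan)
print(result)
```

In this toy rollout the first `craft_pickaxe` is rejected because its prerequisites are not yet met, the third attempt succeeds once they are, and the final attempt is rejected again because crafting destroyed the `wood` prerequisite. A model that only learned "craft_pickaxe produces a pickaxe" from co-occurrence would predict success in all three cases.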
