When to Trust Imagination: Adaptive Action Execution for World Action Models
Rui Wang, Yue Zhang, Jiehong Lin, Kuncheng Luo, Jianan Wang, Zhongrui Wang, Xiaojuan Qi

TL;DR
This paper introduces an adaptive execution framework for World Action Models (WAMs) in robotic manipulation. It lets a robot decide dynamically when to keep trusting its predicted future and when to replan, improving efficiency and robustness in manipulation tasks.
Contribution
The paper proposes Future Forward Dynamics Causal Attention (FFDC), a lightweight verifier for adaptive action execution, and Mixture-of-Horizon Training, which enhances long-horizon trajectory coverage for WAMs.
Findings
Reduces WAM forward passes by 69.10% on RoboTwin
Improves success rate by 2.54% over short-chunk baseline on RoboTwin
Increases real-world success rate by 35%
Abstract
World Action Models (WAMs) have recently emerged as a promising paradigm for robotic manipulation by jointly predicting future visual observations and future actions. However, current WAMs typically execute a fixed number of predicted actions after each model inference, leaving the robot blind to whether the imagined future remains consistent with the actual physical rollout. In this work, we formulate adaptive WAM execution as a future-reality verification problem: the robot should execute longer when the WAM-predicted future remains reliable, and replan earlier when reality deviates from imagination. To this end, we propose Future Forward Dynamics Causal Attention (FFDC), a lightweight verifier that jointly reasons over predicted future actions, predicted visual dynamics, real observations, and language instructions to estimate whether the remaining action rollout can still be…
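The execute-longer-or-replan-earlier idea in the abstract can be illustrated with a minimal sketch. This is not the paper's FFDC verifier: the verifier here is a simple distance threshold standing in for a learned model, and the 1-D world, the names `adaptive_rollout`, `toy_plan`, `ToyWorld`, `toy_verify`, and the values of `HORIZON` and the tolerance are all hypothetical, chosen only to make the control flow concrete.

```python
from typing import Callable, List, Tuple

HORIZON = 8  # assumed chunk length for this toy; not a value from the paper


def adaptive_rollout(
    plan: Callable[[float], Tuple[List[float], List[float]]],
    step: Callable[[float, float], float],
    verify: Callable[[float, float], bool],
    start_obs: float,
    total_steps: int,
) -> Tuple[float, int]:
    """Execute planned actions, replanning as soon as the verifier rejects.

    Returns the final observation and the number of planner calls used.
    """
    obs, executed, model_calls = start_obs, 0, 0
    while executed < total_steps:
        # One (expensive) forward pass: future actions plus imagined observations.
        actions, predicted_obs = plan(obs)
        model_calls += 1
        for action, predicted in zip(actions, predicted_obs):
            obs = step(obs, action)
            executed += 1
            # Stop this chunk if the task is done or reality left the imagined future.
            if executed >= total_steps or not verify(predicted, obs):
                break
    return obs, model_calls


def toy_plan(obs: float) -> Tuple[List[float], List[float]]:
    """Hypothetical planner: move +1 per step and imagine a perfect outcome."""
    actions = [1.0] * HORIZON
    predicted = [obs + (k + 1) for k in range(HORIZON)]
    return actions, predicted


class ToyWorld:
    """Hypothetical 1-D world that slips by 0.5 on its fifth step."""

    def __init__(self) -> None:
        self.t = 0

    def step(self, obs: float, action: float) -> float:
        self.t += 1
        slip = 0.5 if self.t == 5 else 0.0
        return obs + action - slip


def toy_verify(predicted: float, real: float) -> bool:
    """Stand-in verifier: accept while prediction and reality stay within 0.1."""
    return abs(predicted - real) <= 0.1


def run_demo() -> Tuple[float, int]:
    world = ToyWorld()
    return adaptive_rollout(toy_plan, world.step, toy_verify, 0.0, 16)
```

In this toy, `run_demo()` uses three planner calls for 16 steps: the first chunk is cut short at the slip on step 5, the second runs its full horizon, and the third finishes the task. Replanning after every single step would need 16 calls. These numbers describe only the toy setup and say nothing about the paper's reported results.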
