When to Trust Imagination: Adaptive Action Execution for World Action Models

Rui Wang; Yue Zhang; Jiehong Lin; Kuncheng Luo; Jianan Wang; Zhongrui Wang; Xiaojuan Qi

arXiv:2605.06222 · cs.RO · May 12, 2026

TL;DR

This paper introduces an adaptive execution framework for World Action Models (WAMs) in robotic manipulation. Instead of executing a fixed number of predicted actions after each inference, the robot decides at run time whether to keep following its predicted rollout or to replan, which reduces the number of model forward passes and improves robustness.

Contribution

The paper proposes Future Forward Dynamics Causal Attention (FFDC), a lightweight verifier for adaptive action execution, and Mixture-of-Horizon Training, which improves long-horizon trajectory coverage for WAMs.

Findings

01. Reduces WAM forward passes by 69.10% on RoboTwin

02. Improves success rate by 2.54% over the short-chunk baseline on RoboTwin

03. Increases real-world success rate by 35%

Abstract

World Action Models (WAMs) have recently emerged as a promising paradigm for robotic manipulation by jointly predicting future visual observations and future actions. However, current WAMs typically execute a fixed number of predicted actions after each model inference, leaving the robot blind to whether the imagined future remains consistent with the actual physical rollout. In this work, we formulate adaptive WAM execution as a future-reality verification problem: the robot should execute longer when the WAM-predicted future remains reliable, and replan earlier when reality deviates from imagination. To this end, we propose Future Forward Dynamics Causal Attention (FFDC), a lightweight verifier that jointly reasons over predicted future actions, predicted visual dynamics, real observations, and language instructions to estimate whether the remaining action rollout can still be…
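
The execute-longer-or-replan-earlier idea in the abstract can be illustrated with a toy control loop. The sketch below is not the paper's FFDC verifier: it assumes a hypothetical one-dimensional robot state, a hypothetical planner `toy_plan` that returns a chunk of actions with the observations it expects to see, and a simple distance threshold `toy_verify` standing in for the learned verifier. All function names and parameters are illustrative assumptions.

```python
from typing import Callable, List, Set, Tuple

HORIZON = 8  # assumed chunk length produced by one model inference


def adaptive_execute(
    plan: Callable[[float], Tuple[List[float], List[float]]],
    step: Callable[[float, float], float],
    verify: Callable[[float, float], bool],
    obs: float,
    total_steps: int,
) -> Tuple[float, int]:
    """Execute total_steps actions, replanning whenever the real
    observation no longer matches the predicted one."""
    forward_passes = 0
    executed = 0
    while executed < total_steps:
        actions, predicted_obs = plan(obs)  # one (expensive) model inference
        forward_passes += 1
        for action, predicted in zip(actions, predicted_obs):
            obs = step(obs, action)  # act in the real environment
            executed += 1
            if executed >= total_steps:
                break
            if not verify(predicted, obs):
                break  # reality deviated from imagination: replan early
    return obs, forward_passes


def toy_plan(obs: float) -> Tuple[List[float], List[float]]:
    """Hypothetical planner: move +1 per step and predict the resulting states."""
    actions = [1.0] * HORIZON
    predicted = [obs + (i + 1) for i in range(HORIZON)]
    return actions, predicted


def make_toy_step(disturbed_steps: Set[int]) -> Callable[[float, float], float]:
    """Hypothetical environment: the robot slips by 0.5 on the listed steps."""
    count = 0

    def step(obs: float, action: float) -> float:
        nonlocal count
        count += 1
        slip = 0.5 if count in disturbed_steps else 0.0
        return obs + action - slip

    return step


def toy_verify(predicted: float, real: float, tol: float = 0.1) -> bool:
    """Stand-in verifier: trust the plan while prediction and reality agree."""
    return abs(predicted - real) <= tol


# Undisturbed rollout versus a rollout with one slip at step 3.
clean_obs, clean_passes = adaptive_execute(toy_plan, make_toy_step(set()), toy_verify, 0.0, 16)
slip_obs, slip_passes = adaptive_execute(toy_plan, make_toy_step({3}), toy_verify, 0.0, 16)
print(clean_passes, slip_passes)
```

In this toy setting, an undisturbed rollout consumes whole chunks and needs few inferences, while a single slip triggers one extra replan at the point of deviation. A fixed short-chunk policy would instead pay for an inference every few steps regardless of whether the prediction was still valid.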
