Is the Future Compatible? Diagnosing Dynamic Consistency in World Action Models

Bo-Kai Ruan; Teng-Fang Hsiao; Ling Lo; Hong-Han Shuai

arXiv:2605.07514 · cs.RO · May 11, 2026

TL;DR

This paper examines action-state consistency as a measure of World Action Model reliability, proposes a new test-time selection strategy, and reports improved success rates on the RoboCasa and RoboTwin 2.0 robotic benchmarks.

Contribution

The paper introduces action-state consistency as a diagnostic for WAM reliability and proposes a value-free consensus method for selecting rollouts at test time.
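To make the two ideas concrete, here is a minimal sketch, not the authors' implementation. It assumes that consistency can be scored as the mean distance between a rollout's predicted actions and the actions implied by its predicted state transitions, and that "consensus" means preferring the candidate whose action sequence is closest on average to the other candidates. The names `consistency_error`, `consensus_pick`, and `toy_inverse_dynamics` are hypothetical, as is the toy inverse-dynamics rule (action = state difference).

```python
from math import sqrt


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def consistency_error(states, actions, inverse_dynamics):
    """Mean gap between predicted actions and the actions implied by
    consecutive predicted states (assumed scoring rule, lower is better)."""
    gaps = [
        l2(actions[t], inverse_dynamics(states[t], states[t + 1]))
        for t in range(len(actions))
    ]
    return sum(gaps) / len(gaps)


def consensus_pick(action_seqs):
    """Index of the candidate whose actions are closest, on average,
    to those of the other candidates (assumed consensus rule)."""
    n = len(action_seqs)
    if n < 2:
        return 0  # nothing to compare against

    def mean_dist(i):
        total = 0.0
        for j in range(n):
            if j == i:
                continue
            pairs = list(zip(action_seqs[i], action_seqs[j]))
            total += sum(l2(a, b) for a, b in pairs) / len(pairs)
        return total / (n - 1)

    return min(range(n), key=mean_dist)


def toy_inverse_dynamics(s, s_next):
    """Hypothetical stand-in: the action is the state difference."""
    return [b - a for a, b in zip(s, s_next)]


# Toy rollout: three predicted states, two predicted actions.
states = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
matching_actions = [[1.0, 0.0], [0.0, 1.0]]
mismatched_actions = [[0.0, 1.0], [1.0, 0.0]]

err_match = consistency_error(states, matching_actions, toy_inverse_dynamics)
err_mismatch = consistency_error(states, mismatched_actions, toy_inverse_dynamics)

# Three candidate action sequences; the third is an outlier.
candidates = [[[1.0], [1.0]], [[1.5], [1.0]], [[5.0], [5.0]]]
chosen = consensus_pick(candidates)
```

In this toy setup the matching rollout scores a lower error than the mismatched one, and the consensus rule never selects the outlier. No learned value function is involved, which is the sense of "value-free" assumed here.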

Findings

01 Action-state consistency distinguishes successful and failed rollouts.

02 Background collapse can cause deceptive consistency in low-dynamics trajectories.

03 Consensus ranking improves success rates on RoboCasa and RoboTwin 2.0.

Abstract

World Action Models (WAMs) enable decision-making through imagined rollouts by predicting future observations and actions. However, the reliability of these imagined futures remains under-examined: is a generated future merely visually plausible, or is it dynamically compatible with the action sequence it claims to model? In this work, we identify action-state consistency, the alignment between predicted actions and induced state transitions, as a missing reliability axis for WAMs. Through a systematic study across representative joint-prediction and inverse-dynamics models, we find that action-state consistency systematically separates successful and failed rollouts across many tasks and follows similar success-failure trends as learned value estimates. These results suggest that consistency captures decision-relevant structure beyond visual realism. We further identify background…
