From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction
Zhida Zhao, Talas Fu, Yifan Wang, Lijun Wang, Huchuan Lu

TL;DR
This paper introduces the Policy World Model (PWM), a unified framework that integrates world modeling and trajectory planning for autonomous driving, leveraging future state forecasting and collaborative prediction to improve planning reliability.
Contribution
The work presents a novel Policy World Model that unifies world modeling and planning, incorporating a new action-free future state forecasting scheme and a dynamic token generation mechanism.
Findings
Using only front-camera input, PWM matches or exceeds state-of-the-art methods.
Collaborative state-action prediction gives PWM a human-like anticipatory perception, which the authors report yields more reliable planning.
Efficient video forecasting is achieved with a context-guided tokenizer and adaptive focal loss.
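This summary does not give the form of PWM's adaptive focal loss. As background only, here is a minimal sketch of the standard (non-adaptive) focal loss for a single predicted token, which down-weights tokens the model already predicts confidently. The function name, the `gamma` default, and the clamping constant are illustrative assumptions, not taken from the paper.

```python
import math

def focal_loss(probs, target, gamma=2.0):
    """Standard focal loss for one token: -(1 - p_t)^gamma * log(p_t).

    probs  : list of class probabilities that sum to 1 (hypothetical input)
    target : index of the ground-truth token
    gamma  : focusing parameter; gamma = 0 recovers plain cross-entropy
    """
    # Clamp to avoid log(0); the constant is an arbitrary illustrative choice.
    p_t = max(probs[target], 1e-12)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

# An easy token (p_t = 0.9) contributes far less than a hard one (p_t = 0.1).
easy = focal_loss([0.9, 0.1], target=0)
hard = focal_loss([0.1, 0.9], target=0)
```

With `gamma=2.0`, the easy token's loss is scaled by (1 - 0.9)^2 = 0.01 relative to cross-entropy, so training focuses on hard tokens. How PWM adapts this weighting is not described in the text above.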
Abstract
Despite remarkable progress in driving world models, their potential for autonomous systems remains largely untapped: the world models are mostly learned for world simulation and decoupled from trajectory planning. While recent efforts aim to unify world modeling and planning in a single framework, the synergistic facilitation mechanism of world modeling for planning still requires further exploration. In this work, we introduce a new driving paradigm named Policy World Model (PWM), which not only integrates world modeling and trajectory planning within a unified architecture, but is also able to benefit planning using the learned world knowledge through the proposed action-free future state forecasting scheme. Through collaborative state-action prediction, PWM can mimic the human-like anticipatory perception, yielding more reliable planning performance. To facilitate the efficiency of…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications · Human Motion and Animation
