WorldRFT: Latent World Model Planning with Reinforcement Fine-Tuning for Autonomous Driving

Pengxuan Yang; Ben Lu; Zhongpu Xia; Chao Han; Yinfeng Gao; Teng Zhang; Kun Zhan; XianPeng Lang; Yupeng Zheng; Qichao Zhang

arXiv:2512.19133·cs.RO·December 23, 2025

TL;DR

WorldRFT introduces a planning-focused latent world model for autonomous driving, integrating hierarchical planning, local refinement, and reinforcement fine-tuning to improve safety and performance without relying on perception annotations.

Contribution

The paper proposes WorldRFT, a novel framework that aligns scene representation with planning using hierarchical decomposition, local refinement, and reinforcement learning fine-tuning for autonomous driving.

Findings

01 Achieves state-of-the-art results on nuScenes and NavSim benchmarks.

02 Reduces collision rates by 83% on nuScenes.

03 Performs competitively with LiDAR-based methods using camera-only input.

Abstract

Latent World Models enhance scene representation through temporal self-supervised learning, presenting a perception annotation-free paradigm for end-to-end autonomous driving. However, the reconstruction-oriented representation learning tangles perception with planning tasks, leading to suboptimal optimization for planning. To address this challenge, we propose WorldRFT, a planning-oriented latent world model framework that aligns scene representation learning with planning via a hierarchical planning decomposition and local-aware interactive refinement mechanism, augmented by reinforcement learning fine-tuning (RFT) to enhance safety-critical policy performance. Specifically, WorldRFT integrates a vision-geometry foundation model to improve 3D spatial awareness, employs hierarchical planning task decomposition to guide representation optimization, and utilizes local-aware iterative…

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Robotic Path Planning Algorithms · Reinforcement Learning in Robotics