Model-Based Policy Adaptation for Closed-Loop End-to-End Autonomous Driving

Haohong Lin; Yunzhi Zhang; Wenhao Ding; Jiajun Wu; Ding Zhao

arXiv:2511.21584 · cs.RO · November 27, 2025

TL;DR

This paper introduces Model-based Policy Adaptation (MPA), a framework that improves the robustness and safety of pretrained end-to-end autonomous driving models in closed-loop scenarios by generating diverse counterfactual scenarios, training a policy adapter on them, and selecting among candidate trajectories with a learned Q value model.

Contribution

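The paper presents MPA, a framework that adapts a pretrained end-to-end (E2E) driving policy for closed-loop deployment by combining three components: counterfactual trajectories generated with a geometry-consistent simulation engine, a diffusion-based policy adapter that refines the base policy's predictions, and a multi-step Q value model that evaluates long-term outcomes.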

Findings

01. MPA significantly improves performance in closed-loop autonomous driving.

02. The approach enhances safety and robustness in out-of-domain scenarios.

03. Counterfactual data scale and inference strategies impact effectiveness.

Abstract

End-to-end (E2E) autonomous driving models have demonstrated strong performance in open-loop evaluations but often suffer from cascading errors and poor generalization in closed-loop settings. To address this gap, we propose Model-based Policy Adaptation (MPA), a general framework that enhances the robustness and safety of pretrained E2E driving agents during deployment. MPA first generates diverse counterfactual trajectories using a geometry-consistent simulation engine, exposing the agent to scenarios beyond the original dataset. Based on this generated data, MPA trains a diffusion-based policy adapter to refine the base policy's predictions and a multi-step Q value model to evaluate long-term outcomes. At inference time, the adapter proposes multiple trajectory candidates, and the Q value model selects the one with the highest expected utility. Experiments on the nuScenes benchmark…
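
The inference step described in the abstract (propose several candidates, score each, keep the best) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Gaussian perturbation standing in for the diffusion-based adapter, and the hand-written score standing in for the learned multi-step Q value model are all assumptions made for illustration. Only the propose-then-select pattern comes from the text.

```python
import numpy as np


def propose_candidates(base_traj, num_candidates, noise_scale, rng):
    """Hypothetical stand-in for the diffusion-based adapter.

    Perturbs the base policy's trajectory (shape (T, 2)) with Gaussian
    noise and returns an array of shape (num_candidates, T, 2).
    """
    noise = rng.normal(0.0, noise_scale, size=(num_candidates,) + base_traj.shape)
    return base_traj[None] + noise


def toy_q_value(candidates, goal, obstacle, safe_radius=1.0):
    """Hypothetical stand-in for the learned Q value model.

    Scores each candidate by closeness of its final waypoint to the goal,
    minus a penalty when any waypoint comes within safe_radius of the obstacle.
    """
    goal_dist = np.linalg.norm(candidates[:, -1] - goal, axis=-1)
    obstacle_dist = np.linalg.norm(candidates - obstacle, axis=-1).min(axis=1)
    penalty = np.maximum(0.0, safe_radius - obstacle_dist)
    return -goal_dist - 10.0 * penalty


def select_best(candidates, q_values):
    """Return the index and trajectory of the highest-scoring candidate."""
    best = int(np.argmax(q_values))
    return best, candidates[best]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed base prediction: a straight line from (0, 0) to (10, 0).
    base = np.stack([np.linspace(0.0, 10.0, 6), np.zeros(6)], axis=-1)
    candidates = propose_candidates(base, num_candidates=8, noise_scale=0.5, rng=rng)
    q_values = toy_q_value(candidates, goal=np.array([10.0, 0.0]),
                           obstacle=np.array([5.0, 0.0]))
    index, chosen = select_best(candidates, q_values)
    print("selected candidate:", index)
```

In this sketch the selection is a plain argmax over per-candidate scores, so the quality of the result depends entirely on how well the scoring function reflects long-term outcomes, which is the role the abstract assigns to the Q value model.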

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning