Model-Based Policy Adaptation for Closed-Loop End-to-End Autonomous Driving
Haohong Lin, Yunzhi Zhang, Wenhao Ding, Jiajun Wu, Ding Zhao

TL;DR
This paper introduces Model-based Policy Adaptation (MPA), a framework that improves the robustness and safety of end-to-end autonomous driving models in closed-loop scenarios by generating diverse scenarios and refining policies accordingly.
Contribution
The paper presents MPA, a framework that combines counterfactual trajectory generation, a diffusion-based policy adapter, and a multi-step Q value model to improve pretrained end-to-end driving agents in closed-loop settings.
Findings
MPA significantly improves the closed-loop performance of pretrained E2E driving agents.
The approach enhances safety and robustness in out-of-domain scenarios.
Both the scale of the counterfactual data and the choice of inference strategy affect MPA's effectiveness.
Abstract
End-to-end (E2E) autonomous driving models have demonstrated strong performance in open-loop evaluations but often suffer from cascading errors and poor generalization in closed-loop settings. To address this gap, we propose Model-based Policy Adaptation (MPA), a general framework that enhances the robustness and safety of pretrained E2E driving agents during deployment. MPA first generates diverse counterfactual trajectories using a geometry-consistent simulation engine, exposing the agent to scenarios beyond the original dataset. Based on this generated data, MPA trains a diffusion-based policy adapter to refine the base policy's predictions and a multi-step Q value model to evaluate long-term outcomes. At inference time, the adapter proposes multiple trajectory candidates, and the Q value model selects the one with the highest expected utility. Experiments on the nuScenes benchmark…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning
