BridgeSim: Unveiling the OL-CL Gap in End-to-End Autonomous Driving

Seth Z. Zhao; Luobin Wang; Hongwei Ruan; Yuxin Bao; Yilan Chen; Ziyang Leng; Abhijit Ravichandran; Honglin He; Zewei Zhou; Xu Han; Abhishek Peri; Zhiyu Huang; Pranav Desai; Henrik Christensen; Jiaqi Ma; Bolei Zhou

arXiv:2604.10856·cs.RO·April 14, 2026


TL;DR

This paper identifies the causes of the open-loop (OL) to closed-loop (CL) gap in end-to-end autonomous driving policies and proposes a test-time adaptation (TTA) framework to improve how OL-pretrained policies transfer to CL deployment.

Contribution

The paper traces the OL-CL gap to two root causes, Observational Domain Shift and Objective Mismatch, and introduces a TTA method that calibrates observational shift and enforces temporal consistency.

Findings

01. TTA reduces planning biases and improves CL deployment performance.

02. OL policies often learn a biased Q-value estimator that neglects reactive behaviors.

03. Standard OL evaluation protocols may overlook critical CL deployment challenges.
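To make the third finding concrete, here is a minimal toy sketch of how an open-loop score can hide a closed-loop failure. It is not the paper's benchmark or method: the 1D lane-offset dynamics `x_{t+1} = x_t + a_t`, the proportional expert, the constant action bias, and every function name are assumptions chosen for illustration only.

```python
import numpy as np

def rollout_expert(x0, steps, k=0.5):
    """Log expert states/actions for the toy dynamics x_{t+1} = x_t + a_t.

    The expert (hypothetical) steers proportionally back to the lane center x = 0.
    """
    x, states, actions = x0, [x0], []
    for _ in range(steps):
        a = -k * x
        x = x + a
        actions.append(a)
        states.append(x)
    return np.array(states), np.array(actions)

def open_loop_error(policy_actions, expert_actions):
    """Open-loop metric: mean absolute action error on the logged expert states."""
    return float(np.mean(np.abs(policy_actions - expert_actions)))

def closed_loop_replay(x0, policy_actions):
    """Closed-loop rollout of a non-reactive policy: its own actions drive the state."""
    x = x0
    for a in policy_actions:
        x = x + a
    return x

def closed_loop_reactive(x0, steps, bias, k=0.5):
    """Closed-loop rollout of a policy with the same bias that does react to its state."""
    x = x0
    for _ in range(steps):
        x = x + (-k * x + bias)
    return x

steps, bias, x0 = 100, 0.01, 1.0
expert_states, expert_actions = rollout_expert(x0, steps)

# Assumed "open-loop" policy: reproduces the logged expert actions with a small bias,
# but never looks at the state it actually reaches.
policy_actions = expert_actions + bias

ol_err = open_loop_error(policy_actions, expert_actions)
cl_dev_replay = abs(closed_loop_replay(x0, policy_actions) - expert_states[-1])
cl_dev_reactive = abs(closed_loop_reactive(x0, steps, bias) - expert_states[-1])

print(f"open-loop action error:        {ol_err:.4f}")
print(f"closed-loop drift (replay):    {cl_dev_replay:.4f}")
print(f"closed-loop drift (reactive):  {cl_dev_reactive:.4f}")
```

In this toy setting the open-loop error stays at the per-step bias (0.01), while the non-reactive policy drifts by roughly 1.0 after 100 steps because each small error accumulates. The reactive variant with the same bias settles near 0.02, which hints at why feedback-aware behavior matters in closed loop.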

Abstract

The open-loop (OL) to closed-loop (CL) gap, or OL-CL gap, arises when OL-pretrained policies that score highly in OL evaluations fail to transfer effectively to CL deployment. In this paper, we unveil the root causes of this systemic failure and propose a practical remedy. Specifically, we demonstrate that OL policies suffer from Observational Domain Shift and Objective Mismatch. We show that while the former is largely recoverable with adaptation techniques, the latter creates a structural inability to model complex reactive behaviors, and this inability is the primary source of the OL-CL gap. We find that a wide range of OL policies learn a biased Q-value estimator that neglects both the reactive nature of CL simulations and the temporal awareness needed to reduce compounding errors. To this end, we propose a Test-Time Adaptation (TTA) framework that calibrates observational shift, reduces state-action…
