TL;DR
This paper proposes a pseudo-expert regularized offline reinforcement learning framework for end-to-end autonomous driving, improving safety and performance in photorealistic simulation environments.
Contribution
It introduces a novel offline RL method that uses pseudo ground-truth trajectories for behavior regularization, enhancing stability and safety in autonomous driving models.
Findings
Significant reduction in collision rates compared to imitation learning baselines.
Improved route completion rates in a neural rendering simulation environment.
Effective stabilization of value learning through pseudo ground-truth trajectories.
Abstract
End-to-end (E2E) autonomous driving models that take only camera images as input and directly predict a future trajectory are appealing for their computational efficiency and potential for improved generalization via unified optimization; however, persistent failure modes remain due to reliance on imitation learning (IL). While online reinforcement learning (RL) could mitigate IL-induced issues, the computational burden of neural rendering-based simulation and large E2E networks renders iterative reward and hyperparameter tuning costly. We introduce a camera-only E2E offline RL framework that performs no additional exploration and trains solely on a fixed simulator dataset. Offline RL offers strong data efficiency and rapid experimental iteration, yet is susceptible to instability from overestimation on out-of-distribution (OOD) actions. To address this, we construct pseudo ground-truth…
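The behavior-regularization idea described above can be illustrated with a minimal sketch. It is not the paper's actual objective, which is not given in this summary. It assumes a TD3+BC-style actor loss in which the policy maximizes a critic's value estimate while a penalty keeps its predicted trajectory close to a reference trajectory, so the critic is rarely queried on OOD actions. The function name `regularized_actor_loss`, the weight `alpha`, the `(batch, horizon, 2)` waypoint layout, and the mean-squared-error penalty are all assumptions made for illustration.

```python
import numpy as np

def regularized_actor_loss(q_values, pred_traj, pseudo_gt_traj, alpha=2.5):
    """Hypothetical TD3+BC-style actor loss (illustrative, not the paper's).

    q_values:       (B,) critic estimates for the predicted trajectories
    pred_traj:      (B, T, 2) predicted (x, y) waypoints
    pseudo_gt_traj: (B, T, 2) pseudo ground-truth waypoints, assumed given
    alpha:          assumed trade-off weight between value and regularizer
    """
    q = np.asarray(q_values, dtype=np.float64)
    pred = np.asarray(pred_traj, dtype=np.float64)
    ref = np.asarray(pseudo_gt_traj, dtype=np.float64)

    # Scale the value term by the mean |Q| so alpha does not depend on
    # the reward scale (the normalization used in TD3+BC).
    lam = alpha / (np.mean(np.abs(q)) + 1e-8)

    # Behavior-regularization term: stay close to the reference trajectory.
    bc_term = np.mean((pred - ref) ** 2)

    # Value term: minimizing the negative Q maximizes the critic estimate.
    rl_term = -lam * np.mean(q)
    return rl_term + bc_term


if __name__ == "__main__":
    ref = np.zeros((2, 4, 2))
    q = np.ones(2)
    print(regularized_actor_loss(q, ref, ref))        # trajectory on the reference
    print(regularized_actor_loss(q, ref + 1.0, ref))  # trajectory 1 unit away
```

With equal critic values, the trajectory that drifts from the reference incurs the larger loss, which is the mechanism that discourages the policy from exploiting overestimated values on actions absent from the dataset.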
