Pseudo-Expert Regularized Offline RL for End-to-End Autonomous Driving in Photorealistic Closed-Loop Environments

Chihiro Noguchi; Takaki Yamamoto

arXiv:2512.18662 · cs.RO · April 10, 2026

TL;DR

This paper proposes a pseudo-expert regularized offline reinforcement learning framework for end-to-end autonomous driving, improving safety and performance in photorealistic simulation environments.

Contribution

It introduces a novel offline RL method that uses pseudo ground-truth trajectories for behavior regularization, enhancing stability and safety in autonomous driving models.

Findings

01 Significant reduction in collision rates compared to imitation learning baselines.

02 Improved route completion rates in a neural rendering-based simulation environment.

03 Effective stabilization of value learning through pseudo ground-truth trajectories.

Abstract

End-to-end (E2E) autonomous driving models that take only camera images as input and directly predict a future trajectory are appealing for their computational efficiency and potential for improved generalization via unified optimization; however, persistent failure modes remain due to reliance on imitation learning (IL). While online reinforcement learning (RL) could mitigate IL-induced issues, the computational burden of neural rendering-based simulation and large E2E networks renders iterative reward and hyperparameter tuning costly. We introduce a camera-only E2E offline RL framework that performs no additional exploration and trains solely on a fixed simulator dataset. Offline RL offers strong data efficiency and rapid experimental iteration, yet is susceptible to instability from overestimation on out-of-distribution (OOD) actions. To address this, we construct pseudo ground-truth…
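
The abstract is truncated before it describes the actual objective, so the following is only a minimal sketch of what behavior regularization toward pseudo ground-truth trajectories could look like. It substitutes the well-known TD3+BC actor loss (Q-maximization plus a squared-error pull toward a reference action) for whatever the paper actually uses. The function name `regularized_actor_loss`, the `(batch, waypoints, 2)` trajectory layout, and the default `alpha=2.5` are assumptions made for illustration and are not taken from the paper or its repository.

```python
import numpy as np


def regularized_actor_loss(q_values, policy_traj, pseudo_expert_traj, alpha=2.5):
    """TD3+BC-style actor loss (illustrative stand-in, not the paper's objective).

    q_values:            (B,) critic estimates for the policy's trajectories
    policy_traj:         (B, T, 2) waypoints predicted by the policy
    pseudo_expert_traj:  (B, T, 2) pseudo ground-truth waypoints (assumed given)
    Returns (total_loss, regularization_term).
    """
    q_values = np.asarray(q_values, dtype=float)
    policy_traj = np.asarray(policy_traj, dtype=float)
    pseudo_expert_traj = np.asarray(pseudo_expert_traj, dtype=float)

    # Scale the Q term by the mean |Q| so alpha does not depend on reward scale.
    lam = alpha / (np.mean(np.abs(q_values)) + 1e-8)

    # Behavior regularizer: mean squared distance to the pseudo-expert waypoints.
    # Keeping the policy near these trajectories limits out-of-distribution actions.
    reg = np.mean((policy_traj - pseudo_expert_traj) ** 2)

    # Minimizing this loss maximizes Q while staying close to the pseudo-expert.
    total = -lam * np.mean(q_values) + reg
    return float(total), float(reg)
```

In this sketch, a larger `alpha` gives more weight to the critic and less to the pseudo-expert, so lowering it is the conservative choice when value estimates are unreliable.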

Code & Models

Repositories

ToyotaInfoTech/PEBC (GitHub)
