ReconDreamer-RL: Enhancing Reinforcement Learning via Diffusion-based Scene Reconstruction
Chaojun Ni, Guosheng Zhao, Xiaofeng Wang, Zheng Zhu, Wenkang Qin, Xinze Chen, Guanghong Jia, Guan Huang, Wenjun Mei

TL;DR
ReconDreamer-RL integrates diffusion-based scene reconstruction with reinforcement learning to create realistic, diverse driving scenarios, significantly reducing collisions in autonomous driving models.
Contribution
The paper introduces ReconDreamer-RL, combining diffusion priors and physical models for improved scene reconstruction and corner-case scenario generation in autonomous driving training.
Findings
5x reduction in collision ratio compared to imitation learning
Enhanced simulation-to-reality transfer in reinforcement learning
Generation of diverse corner-case traffic scenarios
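As a rough illustration of what a "5x reduction in collision ratio" means arithmetically, the sketch below assumes the collision ratio is the fraction of evaluation episodes that end in a collision; the paper's exact definition may differ. The function names and the episode outcomes are hypothetical, chosen so that the factor comes out near five, and are not results from the paper.

```python
from typing import Sequence


def collision_ratio(collided: Sequence[bool]) -> float:
    """Fraction of episodes that ended in a collision (assumed definition)."""
    if not collided:
        raise ValueError("need at least one episode")
    return sum(collided) / len(collided)


def reduction_factor(baseline: float, improved: float) -> float:
    """How many times smaller the improved ratio is than the baseline."""
    if improved <= 0:
        raise ValueError("improved ratio must be positive")
    return baseline / improved


# Hypothetical outcomes: 10 of 100 episodes collide for an imitation-learning
# policy, 2 of 100 for an RL-trained policy. These numbers are made up.
il_episodes = [True] * 10 + [False] * 90
rl_episodes = [True] * 2 + [False] * 98

il_ratio = collision_ratio(il_episodes)
rl_ratio = collision_ratio(rl_episodes)
factor = reduction_factor(il_ratio, rl_ratio)

print(f"IL: {il_ratio:.2%}, RL: {rl_ratio:.2%}, reduction: {factor:.1f}x")
```

A reduction factor is a ratio of ratios, so it says nothing about absolute collision counts: going from 10% to 2% and from 1% to 0.2% are both 5x reductions.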
Abstract
Reinforcement learning for training end-to-end autonomous driving models in closed-loop simulations is gaining growing attention. However, most simulation environments differ significantly from real-world conditions, creating a substantial simulation-to-reality (sim2real) gap. To bridge this gap, some approaches utilize scene reconstruction techniques to create photorealistic environments as a simulator. While this improves realistic sensor simulation, these methods are inherently constrained by the distribution of the training data, making it difficult to render high-quality sensor data for novel trajectories or corner case scenarios. Therefore, we propose ReconDreamer-RL, a framework designed to integrate video diffusion priors into scene reconstruction to aid reinforcement learning, thereby enhancing end-to-end autonomous driving training. Specifically, in ReconDreamer-RL, we…
Peer Reviews
Decision: Submitted to ICLR 2026
- **Significance and Scope.** The paper addresses an important and timely problem in autonomous driving (closed-loop, end-to-end learning) by coupling photorealistic simulation with reinforcement learning in a way that directly targets deployment-relevant failures.
- **Comprehensive Qualitative Evaluation.** The study presents rich qualitative evidence across multiple benchmarks and driving scenarios, helping to substantiate generality and offering interpretable insights through scenario visualiza…
Despite the strengths mentioned above, I have some significant concerns about the technical contribution claimed by the paper, listed in this and the question section below:
- **Unclear presentation of the ReconSimulator.** The core technical contribution of this work comes from the ReconSimulator; however, the details of this simulator are unclear. For instance, how is the diffusion prior incorporated? From Section 3.3, it seems that the ReconSimulator uses a training checkpoint rendered by 3DGS a…
1. The paper introduces a novel method of integrating video diffusion priors directly into the reinforcement learning loop, creating a dynamic simulator that effectively addresses the sim2real gap.
2. It proposes a well-structured solution with dedicated modules (DAA and CTG) to systematically generate adversarial corner cases and mitigate common training data biases.
3. The framework demonstrates a significant and quantifiable performance improvement, achieving a 5x reduction in collision rat…
1. Lack of originality. Most techniques proposed in the paper have already been well studied in other works. Using video models to boost novel-trajectory reconstruction performance is studied in many works, such as DriveDreamer4D [1]; decoupled static and dynamic scene representation is proposed in MTGS [2] and OmniRe [3]; adversarial agent interaction is studied in a line of research on safety-critical interaction, such as DiffScene [4].
2. Missing details in the dynamic adversary agent section. H…
1. The paper shows strong empirical gains on collision metrics across standard and corner‑case evaluations, with a large gap over both IL baselines and RAD.
2. RL‑friendly simulator: 3DGS‑based with diffusion priors while retaining real‑time speed (Table 2). This addresses a common bottleneck in closed‑loop training.
1. The policy is evaluated in the same simulator family that was fine‑tuned using diffusion priors, and many corner cases are generated by the paper’s own procedures (DAA, CTG). The real sim‑to‑real implications remain untested; there is no closed‑loop evaluation outside the authors’ reconstructions (e.g., Bench2Drive/nuPlan/CARLA‑v2 evaluations or any real‑vehicle trials).
2. Metrics are limited for driving quality. Collisions are critical, but comfort (jerk/accel), traffic‑rule compliance, and route co…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Traffic control and management
