SCORP: Scene-Consistent Multi-agent Diffusion Planning with Stable Online Reinforcement Post-Training for Cooperative Driving

Haojie Bai, Aimin Li, Ruoyu Yao, Xiongwei Zhao, Tingting Zhang, Xing Zhang, Lin Gao, and Jun Ma

arXiv:2604.11734 · cs.RO · May 12, 2026

TL;DR

SCORP is a scene-consistent multi-agent diffusion planner with stable online reinforcement learning (RL) post-training that improves safety and efficiency in cooperative driving, outperforming open-source baselines on WOMD.

Contribution

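The paper introduces a scene-conditioned multi-agent denoising architecture for pre-training and a stable online RL post-training framework for cooperative driving. Together they target two weaknesses of existing diffusion-based planners: weak scene consistency and unstable post-training in reactive multi-agent environments.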

Findings

01

SCORP outperforms open-source baselines on the Waymo Open Motion Dataset (WOMD) in safety and efficiency.

02

The method achieves improvements of 10.47% to 28.26% in safety metrics.

03

SCORP provides consistent gains in driving safety and traffic efficiency.

Abstract

Cooperative driving is a safety- and efficiency-critical task that requires the coordination of diverse, interaction-realistic multi-agent trajectories. Although existing diffusion-based methods can capture multimodal behaviors from demonstrations, they often exhibit weak scene consistency and poor alignment with closed-loop cooperative objectives. This makes post-training necessary for further improvement, yet achieving stable online post-training in reactive multi-agent environments remains challenging. In this paper, we propose SCORP, a scene-consistent multi-agent diffusion planner with stable online reinforcement learning (RL) post-training for cooperative driving. For pre-training, we develop a scene-conditioned multi-agent denoising architecture that couples inter-agent self-attention with a dual-path conditioning mechanism: cross-attention provides direct scene-information…
