Building reliable sim driving agents by scaling self-play

Daphne Cornelisse; Aarav Pandya; Kevin Joseph; Joseph Suárez; Eugene Vinitsky

arXiv:2502.14706 · cs.AI · May 21, 2025

TL;DR

This paper presents a scalable self-play training approach for simulation agents used in autonomous vehicle testing. Trained from scratch on a single GPU in about a day, the agents reach a 99.8% goal completion rate on unseen test scenes, and the resulting agents are open-sourced.

Contribution

We scale self-play to thousands of Waymo Open Motion Dataset scenarios to train reliable simulation agents for autonomous vehicle testing, and show that the agents perform well on unseen scenes and can be fine-tuned quickly.

Findings

01. Agents achieve a 99.8% goal completion rate with less than 0.8% collisions on 10,000 scenarios.

02. Agents trained from scratch solve almost the full training set within a day on a single GPU.

03. Agents are partially robust to out-of-distribution scenes and can be fine-tuned rapidly.
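
As a rough illustration of how headline numbers like the goal completion and collision rates above could be aggregated, here is a minimal sketch. It assumes one boolean record per controlled agent per evaluation rollout; the `AgentOutcome` class, the `summarize` function, and the toy data are all hypothetical and are not taken from the paper or from the gpudrive codebase. Whether the paper averages per agent or per scenario is not stated on this page, so per-agent averaging is an assumption here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class AgentOutcome:
    """Hypothetical per-agent record for one evaluation rollout."""
    reached_goal: bool
    collided: bool


def summarize(outcomes: Iterable[AgentOutcome]) -> Dict[str, float]:
    """Return the fraction of agents that reached their goal and that collided."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one agent outcome")
    n = len(outcomes)
    return {
        # bool sums as 0/1, so each sum counts agents with the flag set
        "goal_rate": sum(o.reached_goal for o in outcomes) / n,
        "collision_rate": sum(o.collided for o in outcomes) / n,
    }


# Made-up toy data (NOT results from the paper): 1000 agents,
# 998 of which reach their goal and 5 of which collide.
toy = (
    [AgentOutcome(reached_goal=True, collided=False)] * 995
    + [AgentOutcome(reached_goal=True, collided=True)] * 3
    + [AgentOutcome(reached_goal=False, collided=True)] * 2
)
stats = summarize(toy)
print(f"goal rate: {stats['goal_rate']:.1%}, "
      f"collision rate: {stats['collision_rate']:.1%}")
```

Note that the two rates are not complementary in this sketch: an agent can collide and still reach its goal, which is why both are tracked separately.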

Abstract

Simulation agents are essential for designing and testing systems that interact with humans, such as autonomous vehicles (AVs). These agents serve various purposes, from benchmarking AV performance to stress-testing system limits, but all applications share one key requirement: reliability. To enable sound experimentation, a simulation agent must behave as intended. It should minimize actions that may lead to undesired outcomes, such as collisions, which can distort the signal-to-noise ratio in analyses. As a foundation for reliable sim agents, we propose scaling self-play to thousands of scenarios on the Waymo Open Motion Dataset under semi-realistic limits on human perception and control. Training from scratch on a single GPU, our agents solve almost the full training set within a day. They generalize to unseen test scenes, achieving a 99.8% goal completion rate with less than 0.8%…

Code & Models

Repositories

emerge-lab/gpudrive
jax · Official

Taxonomy

Topics: Evacuation and Crowd Dynamics

Methods: Sparse Evolutionary Training