GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS

Saman Kazemkhani; Aarav Pandya; Daphne Cornelisse; Brennan Shacklett; Eugene Vinitsky

arXiv:2408.01584·cs.AI·February 19, 2025

TL;DR

GPUDrive is a GPU-accelerated multi-agent simulator capable of over a million simulation steps per second, enabling scalable reinforcement learning for complex multi-agent planning tasks.

Contribution

The paper introduces GPUDrive, a high-performance, flexible multi-agent simulation platform that significantly accelerates the training and evaluation of multi-agent systems.

Findings

01

Achieved over 1 million simulation steps per second.

02

Enabled training of RL agents on large-scale datasets in minutes.

03

Demonstrated scalable multi-agent planning with complex behaviors.

Abstract

Multi-agent learning algorithms have been successful at generating superhuman planning in various games but have had limited impact on the design of deployed multi-agent planners. A key bottleneck in applying these techniques to multi-agent planning is that they require billions of steps of experience. To enable the study of multi-agent planning at scale, we present GPUDrive. GPUDrive is a GPU-accelerated, multi-agent simulator built on top of the Madrona Game Engine capable of generating over a million simulation steps per second. Observation, reward, and dynamics functions are written directly in C++, allowing users to define complex, heterogeneous agent behaviors that are lowered to high-performance CUDA. Despite these low-level optimizations, GPUDrive is fully accessible through Python, offering a seamless and efficient workflow for multi-agent, closed-loop simulation. Using…
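To put the abstract's two numbers side by side, here is a back-of-the-envelope sketch of how long "billions of steps" take at a given simulator throughput. The function name and the slower comparison rate are illustrative assumptions, not figures from the paper, and the estimate ignores learner overhead.

```python
def wall_clock_seconds(total_steps: float, steps_per_second: float) -> float:
    """Time needed to generate total_steps of experience at a fixed throughput."""
    return total_steps / steps_per_second


# One billion steps at the abstract's headline rate of one million steps/s.
fast = wall_clock_seconds(1e9, 1e6)

# Hypothetical slower simulator at ten thousand steps/s, for comparison only.
slow = wall_clock_seconds(1e9, 1e4)

print(f"1e9 steps at 1e6 steps/s: {fast / 60:.1f} minutes")
print(f"1e9 steps at 1e4 steps/s: {slow / 3600:.1f} hours")
```

Under these assumptions, simulation time per billion steps drops from roughly a day to under twenty minutes.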

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

- The proposed simulator has the flexibility to handle multiple modalities of sensor data.
- The authors have implemented ways to reduce the memory footprint due to the large number of agents and observation space using algorithms like Bounding Volume Hierarchy (to exclude certain agent pairs for collision checking) and polyline decimation to approximate the straight polylines.
- The trained agents are claimed to be useful for out-of-distribution tests for the driving agents.
- The authors p
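The polyline decimation mentioned in the strengths above can be illustrated with a minimal sketch. The review does not name the algorithm GPUDrive uses, so this is a simple greedy, area-based variant chosen for illustration; `decimate_polyline`, `triangle_area`, and the threshold value are hypothetical names and numbers, not the simulator's API.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Area of the triangle a-b-c; zero when the three points are collinear."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def decimate_polyline(points: List[Point], area_threshold: float = 1e-3) -> List[Point]:
    """Greedy single pass: drop an interior point when it forms a near-zero-area
    triangle with the last kept point and the next point (i.e. it lies on an
    almost straight stretch). Endpoints are always kept."""
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        if triangle_area(kept[-1], points[i], points[i + 1]) > area_threshold:
            kept.append(points[i])
    kept.append(points[-1])
    return kept


# An L-shaped road edge sampled every metre: only the corner survives.
road_edge = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (3.0, 1.0), (3.0, 2.0)]
simplified = decimate_polyline(road_edge)
print(simplified)
```

Straight road segments collapse to their endpoints, so each agent's observation and the collision checks touch fewer map points.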

Weaknesses

- The paper does not provide simple IDM (Intelligent Driver Model) agents, which can be practical for giving other vehicles basic reactivity to the ego agent.
- The authors mention that the current work is limited in properly utilizing the generated samples for optimal training.
- Just a thought: The implementation is in C++ and it provides a binding interface with Python environments. It would have been nice to have a mono-language (primarily Python based) tool as the model training and other related
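For readers unfamiliar with the IDM agents requested in the first weakness above, here is a minimal sketch of the standard Intelligent Driver Model acceleration rule. This is not part of GPUDrive according to the review; the function name and all default parameter values are illustrative assumptions.

```python
import math


def idm_acceleration(
    v: float,             # follower speed (m/s)
    gap: float,           # bumper-to-bumper distance to the leader (m), must be > 0
    dv: float,            # approach rate: follower speed minus leader speed (m/s)
    v0: float = 15.0,     # desired speed (m/s), illustrative
    T: float = 1.5,       # desired time headway (s), illustrative
    s0: float = 2.0,      # minimum standstill gap (m), illustrative
    a_max: float = 1.0,   # maximum acceleration (m/s^2), illustrative
    b: float = 2.0,       # comfortable deceleration (m/s^2), illustrative
    delta: float = 4.0,   # acceleration exponent
) -> float:
    """Standard IDM: free-road term minus an interaction term driven by the gap."""
    # Desired dynamic gap; the speed-dependent part is clamped at zero.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


# Standing start on an empty road: accelerate at roughly a_max.
print(idm_acceleration(v=0.0, gap=1e9, dv=0.0))
# Closing quickly on a nearby leader: strong braking.
print(idm_acceleration(v=10.0, gap=5.0, dv=10.0))
```

A rule like this gives background vehicles basic reactivity to the ego agent without any learning.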

Reviewer 02 · Rating 8 · Confidence 4

Strengths

1. A multi-agent simulator accelerated on the GPU, iterating at over a million steps per second.
2. Very well written and structured code to run any experiment easily, with a lot of easy experimentation code readily available.
3. Extensive results analyzing the sampling frequency of the simulation.

Weaknesses

1. Figure 2 needs a better caption and an explanation.
2. Designed to fit one exact dataset. A section explaining the effort required to integrate other datasets is desirable.

Reviewer 03 · Rating 8 · Confidence 4

Strengths

- The proposed simulator improves over alternatives in terms of sample efficiency. All of the design choices appear reasonable, and the underlying source code with pre-trained driving baselines will be released.
- The ability to load real-world driving datasets is extremely useful, and providing a variety of observation spaces is a great feature.
- Transparency about the current limitations of the benchmark is very helpful for user adaptation.

Weaknesses

- While the focus of this paper is on providing a novel simulator, it would be very interesting to see some more complex behavior over longer time-horizons to fully capture the capabilities unlocked by the simulator (e.g. training a single agent policy with higher velocity limit to weave through a simulated traffic scene, etc.)
- Showcasing such behavior would likely require addressing the “Absence of a map” limitation raised in the paper, in order to formulate more sophisticated reward function

Code & Models

Repositories

emerge-lab/gpudrive
jax · Official

Taxonomy

Topics: Simulation Techniques and Applications · Autonomous Vehicle Technology and Safety

Methods: Balanced Selection