SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model

Shuhan Tan; John Lambert; Hong Jeon; Sakshum Kulshrestha; Yijing Bai; Jing Luo; Dragomir Anguelov; Mingxing Tan; Chiyu Max Jiang

arXiv:2506.21976 · cs.LG · June 30, 2025

TL;DR

SceneDiffuser++ is an end-to-end generative world model for realistic, city-scale traffic simulation. It integrates scene generation, agent behavior modeling, occlusion reasoning, and environment dynamics such as traffic light states, and it supports trip-level validation.

Contribution

The paper introduces SceneDiffuser++, which it presents as the first end-to-end generative world model capable of seamless city-scale traffic simulation over a full trip from point A to point B.

Findings

01. Demonstrates realistic city-scale traffic simulation capabilities.

02. Outperforms existing methods in realism over long simulation horizons.

03. Is validated on an extended Waymo dataset that supports trip-level simulation.

Abstract

The goal of traffic simulation is to augment a potentially limited amount of manually-driven miles that is available for testing and validation, with a much larger amount of simulated synthetic miles. The culmination of this vision would be a generative simulated city, where given a map of the city and an autonomous vehicle (AV) software stack, the simulator can seamlessly simulate the trip from point A to point B by populating the city around the AV and controlling all aspects of the scene, from animating the dynamic agents (e.g., vehicles, pedestrians) to controlling the traffic light states. We refer to this vision as CitySim, which requires an agglomeration of simulation technologies: scene generation to populate the initial scene, agent behavior modeling to animate the scene, occlusion reasoning, dynamic scene generation to seamlessly spawn and remove agents, and environment…
