RLFTSim: Realistic and Controllable Multi-Agent Traffic Simulation via Reinforcement Learning Fine-Tuning

Ehsan Ahmadi; Hunter Schofield; Behzad Khamidehi; Fazel Arasteh; Jinjun Shan; Lili Mou; Dongfeng Bai; Kasra Rezaee

arXiv:2605.19033·cs.RO·May 20, 2026

TL;DR

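RLFTSim is a reinforcement learning fine-tuning framework that aligns a pre-trained traffic simulator's rollouts with real-world data and distills goal-conditioned controllability, achieving state-of-the-art realism on the Waymo Open Motion Dataset with fewer samples than heuristic search-based fine-tuning methods.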

Contribution

RLFTSim introduces a reward-based fine-tuning method for traffic simulation whose reward balances fidelity and controllability. It improves realism while requiring fewer samples than heuristic search-based fine-tuning methods.

Findings

01. Achieves state-of-the-art realism in traffic simulation.

02. Requires significantly fewer samples than heuristic search-based fine-tuning methods.

03. Effectively distills goal-conditioned controllability in simulations.

Abstract

Supervised open-loop training has been widely adopted for training traffic simulation models; however, it fails to capture the inherently dynamic, multi-agent interactions common in complex driving scenarios. We introduce RLFTSim, a reinforcement-learning-based fine-tuning framework that enhances scenario realism by aligning simulator rollouts with real-world data distributions and provides a method for distilling goal-conditioned controllability in scenario generation. We instantiate RLFTSim on top of a pre-trained simulation model, design a reward that balances fidelity and controllability, and perform comprehensive experiments on the Waymo Open Motion Dataset. Our results show improvements in realism, achieving state-of-the-art performance. Compared with other heuristic search-based fine-tuning methods, RLFTSim requires significantly fewer samples due to a proposed low-variance and…
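
To make the idea of a reward that balances fidelity and controllability concrete, here is a minimal sketch. It is not the paper's reward: the abstract does not give the formula, so the two terms (mean displacement from the logged trajectory, and final distance to a goal point), the `weight` parameter, and all function names are assumptions chosen only for illustration.

```python
import math


def displacement(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def fidelity_term(rollout, logged):
    """Hypothetical fidelity term: negative mean displacement between
    the simulated rollout and the logged trajectory, step by step."""
    errors = [displacement(p, q) for p, q in zip(rollout, logged)]
    return -sum(errors) / len(errors)


def goal_term(rollout, goal):
    """Hypothetical controllability term: negative distance between
    the final simulated position and the requested goal point."""
    return -displacement(rollout[-1], goal)


def combined_reward(rollout, logged, goal, weight=0.5):
    """Convex combination of the two terms. `weight` is an assumed
    trade-off knob: 1.0 means fidelity only, 0.0 means goal only."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    fidelity = fidelity_term(rollout, logged)
    control = goal_term(rollout, goal)
    return weight * fidelity + (1.0 - weight) * control


# Toy single-agent example with made-up coordinates.
logged = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
rollout = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
goal = (2.0, 0.0)
reward = combined_reward(rollout, logged, goal)
```

In an RL fine-tuning loop, a scalar like `reward` would score each sampled rollout. A multi-agent version would aggregate the terms over agents, and a distribution-matching realism metric could replace the simple displacement term used here.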
