TL;DR
RLFTSim is a reinforcement-learning framework that fine-tunes traffic simulators so that their rollouts better match real-world data and support goal-conditioned control, achieving state-of-the-art realism with fewer samples than heuristic search-based fine-tuning methods.
Contribution
RLFTSim introduces a reward-based fine-tuning method for traffic simulation that improves realism and controllability while requiring fewer samples than heuristic search-based methods.
Findings
Achieves state-of-the-art realism in traffic simulation on the Waymo Open Motion Dataset.
Requires significantly fewer samples than heuristic search methods.
Distills goal-conditioned controllability into scenario generation.
Abstract
Supervised open-loop training has been widely adopted for training traffic simulation models; however, it fails to capture the inherently dynamic, multi-agent interactions common in complex driving scenarios. We introduce RLFTSim, a reinforcement-learning-based fine-tuning framework that enhances scenario realism by aligning simulator rollouts with real-world data distributions and provides a method for distilling goal-conditioned controllability in scenario generation. We instantiate RLFTSim on top of a pre-trained simulation model, design a reward that balances fidelity and controllability, and perform comprehensive experiments on the Waymo Open Motion Dataset. Our results show improvements in realism, achieving state-of-the-art performance. Compared with other heuristic search-based fine-tuning methods, RLFTSim requires significantly fewer samples due to a proposed low-variance and…
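The abstract does not state the form of the reward, so the following is only an illustrative sketch of what "balancing fidelity and controllability" could look like. It assumes a weighted sum of two terms: a fidelity term (negative mean displacement between a simulated rollout and the logged trajectory) and a controllability term (negative final distance to a goal point). The function names, the weights `w_fidelity` and `w_control`, and both metrics are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mean_displacement(rollout: List[Point], logged: List[Point]) -> float:
    """Average Euclidean distance between matching time steps."""
    if len(rollout) != len(logged) or not rollout:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, q) for p, q in zip(rollout, logged)) / len(rollout)


def combined_reward(
    rollout: List[Point],
    logged: List[Point],
    goal: Point,
    w_fidelity: float = 1.0,  # hypothetical weight, not from the paper
    w_control: float = 0.5,   # hypothetical weight, not from the paper
) -> float:
    """Weighted sum of a fidelity term and a goal-reaching term (assumed form)."""
    fidelity = -mean_displacement(rollout, logged)  # closer to the log is better
    control = -math.dist(rollout[-1], goal)         # ending nearer the goal is better
    return w_fidelity * fidelity + w_control * control
```

For example, `combined_reward([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)], goal=(3.0, 4.0))` penalizes the rollout for drifting from the log while giving no goal penalty, since it ends exactly at the goal. Raising `w_control` relative to `w_fidelity` would favor goal-reaching over matching logged behavior.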
