MTDrive: Multi-turn Interactive Reinforcement Learning for Autonomous Driving

Xidong Li; Mingyu Guo; Chenchao Xu; Bailin Li; Wenjing Zhu; Yangang Zou; Rui Chen; Zehuan Wang

arXiv:2601.22930 · cs.RO · February 2, 2026

Open Access

TL;DR

MTDrive introduces a multi-turn reinforcement learning framework for autonomous driving trajectory planning, enabling iterative refinement based on environmental feedback, which improves performance and efficiency over existing single-turn methods.

Contribution

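The paper proposes a multi-turn reasoning framework for autonomous driving trained with Multi-Turn Group Relative Policy Optimization (mtGRPO), an interactive trajectory understanding dataset built from closed-loop simulation, and system optimizations that improve training efficiency.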

Findings

01 Outperforms existing methods on the NAVSIM benchmark

02 Achieves 2.5x training throughput through system optimizations

03 Demonstrates effective multi-turn trajectory refinement

Abstract

Trajectory planning is a core task in autonomous driving, requiring the prediction of safe and comfortable paths across diverse scenarios. Integrating Multi-modal Large Language Models (MLLMs) with Reinforcement Learning (RL) has shown promise in addressing "long-tail" scenarios. However, existing methods are constrained to single-turn reasoning, limiting their ability to handle complex tasks requiring iterative refinement. To overcome this limitation, we present MTDrive, a multi-turn framework that enables MLLMs to iteratively refine trajectories based on environmental feedback. MTDrive introduces Multi-Turn Group Relative Policy Optimization (mtGRPO), which mitigates reward sparsity by computing relative advantages across turns. We further construct an interactive trajectory understanding dataset from closed-loop simulation to support multi-turn training. Experiments on the NAVSIM…
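The abstract does not give the mtGRPO formula, so the following is only a minimal sketch of one plausible reading of "relative advantages across turns": each turn's reward is normalized against the same turn in the other rollouts of a sampled group, so every turn gets its own learning signal instead of one sparse end-of-episode reward. The function name `per_turn_group_advantages`, the `rewards[g][t]` layout, and the example reward values are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def per_turn_group_advantages(rewards, eps=1e-8):
    """Group-relative advantages computed separately for each turn.

    rewards[g][t] is the reward of rollout g at turn t (assumed layout).
    Returns advantages[g][t] = (r - group mean at turn t) / (group std + eps).
    """
    num_turns = len(rewards[0])
    if any(len(r) != num_turns for r in rewards):
        raise ValueError("all rollouts must have the same number of turns")

    advantages = [[0.0] * num_turns for _ in rewards]
    for t in range(num_turns):
        # Rewards of every rollout in the group at this turn.
        column = [r[t] for r in rewards]
        mu = mean(column)
        sigma = pstdev(column)  # population standard deviation over the group
        for g, value in enumerate(column):
            advantages[g][t] = (value - mu) / (sigma + eps)
    return advantages


# Illustrative (made-up) rewards: 3 rollouts, 2 refinement turns each.
rewards = [
    [0.2, 0.6],
    [0.4, 0.4],
    [0.6, 0.8],
]
adv = per_turn_group_advantages(rewards)
```

In this toy group, the rollout with the highest reward at a given turn receives a positive advantage for that turn and the lowest a negative one, independently of how the same rollout fared at other turns.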

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics