TADPO: Reinforcement Learning Goes Off-road

Zhouchonghao Wu; Raymond Song; Vedant Mundheda; Luis E. Navarro-Serment; Christof Schoenborn; Jeff Schneider

arXiv:2603.05995 · cs.RO · March 9, 2026

TL;DR

This paper introduces TADPO, a novel reinforcement learning method that enhances off-road autonomous driving by enabling high-speed, zero-shot sim-to-real transfer on a full-scale vehicle navigating complex terrains.

Contribution

The paper presents TADPO, a new policy gradient approach extending PPO with off-policy guidance, and demonstrates its successful deployment on a real off-road vehicle.

Findings

01. Effective zero-shot sim-to-real transfer achieved

02. First RL-based policy deployed on a full-scale off-road vehicle

03. High-speed navigation in complex terrains demonstrated

Abstract

Off-road autonomous driving poses significant challenges such as navigating unmapped, variable terrain with uncertain and diverse dynamics. Addressing these challenges requires effective long-horizon planning and adaptable control. Reinforcement Learning (RL) offers a promising solution by learning control policies directly from interaction. However, because off-road driving is a long-horizon task with low-signal rewards, standard RL methods are challenging to apply in this setting. We introduce TADPO, a novel policy gradient formulation that extends Proximal Policy Optimization (PPO), leveraging off-policy trajectories for teacher guidance and on-policy trajectories for student exploration. Building on this, we develop a vision-based, end-to-end RL system for high-speed off-road driving, capable of navigating extreme slopes and obstacle-rich terrain. We demonstrate our performance in…
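The abstract describes TADPO only at a high level: a PPO extension that draws on off-policy teacher trajectories and on-policy student trajectories. As a rough illustration of that idea, the sketch below implements the standard PPO clipped surrogate and then blends a student term with a teacher term. The blend itself, the function `mixed_objective`, and the `teacher_weight` parameter are assumptions made for illustration; they are not taken from the paper, whose actual formulation may differ substantially.

```python
import math


def ppo_clip_surrogate(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    logp_new:   log-probabilities of the taken actions under the current policy
    logp_old:   log-probabilities under the policy that collected the data
    advantages: advantage estimates for those actions
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # importance ratio
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def mixed_objective(student_batch, teacher_batch, teacher_weight=0.5, clip_eps=0.2):
    """Hypothetical blend of an on-policy and an off-policy term.

    ASSUMPTION: this weighted sum is illustrative only and is not the
    TADPO objective. Each batch is a (logp_new, logp_old, advantages) tuple;
    for the teacher batch, logp_old is the teacher's log-probability.
    """
    student_term = ppo_clip_surrogate(*student_batch, clip_eps=clip_eps)
    teacher_term = ppo_clip_surrogate(*teacher_batch, clip_eps=clip_eps)
    return (1.0 - teacher_weight) * student_term + teacher_weight * teacher_term


if __name__ == "__main__":
    student = ([0.0, -0.1], [0.0, -0.2], [1.0, 0.5])
    teacher = ([-0.5, -0.3], [-0.1, -0.1], [1.0, 1.0])
    print(mixed_objective(student, teacher))
```

Clipping keeps any single update from moving the policy far from the data-collecting policy, which matters more for the teacher term because the importance ratios of off-policy data can be far from 1.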

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Robotic Locomotion and Control