From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving

Antonio Guillen-Perez

arXiv:2508.07029·cs.LG·August 28, 2025

TL;DR

This paper compares imitation learning and offline reinforcement learning for autonomous driving, showing that offline RL yields more robust and safer policies than behavioral cloning when both are trained on the same large-scale real-world dataset.

Contribution

The paper introduces a comprehensive pipeline for offline learning in autonomous driving and shows that offline RL with Conservative Q-Learning (CQL) outperforms behavioral cloning (BC) in robustness and safety on real-world data.

Findings

01 CQL achieves a 3.2x higher success rate than BC

02 CQL achieves a 7.4x lower collision rate than BC

03 Offline RL enhances robustness in long-horizon driving scenarios

Abstract

Learning robust driving policies from large-scale, real-world datasets is a central challenge in autonomous driving, as online data collection is often unsafe and impractical. While Behavioral Cloning (BC) offers a straightforward approach to imitation learning, policies trained with BC are notoriously brittle and suffer from compounding errors in closed-loop execution. This work presents a comprehensive pipeline and a comparative study to address this limitation. We first develop a series of increasingly sophisticated BC baselines, culminating in a Transformer-based model that operates on a structured, entity-centric state representation. While this model achieves low imitation loss, we show that it still fails in long-horizon simulations. We then demonstrate that by applying a state-of-the-art Offline Reinforcement Learning algorithm, Conservative Q-Learning (CQL), to the same data…
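
The contrast the abstract draws between BC and CQL can be made concrete with a minimal sketch of the two training objectives. This is an illustration, not the paper's implementation: it assumes a small discrete action set so the log-sum-exp can be computed exactly (driving actions are typically continuous, where this term is usually estimated by sampling), and the function names `bc_loss`, `cql_penalty`, and `cql_loss`, along with the weight `alpha`, are hypothetical.

```python
import numpy as np

def bc_loss(logits, actions):
    """Behavioral cloning: cross-entropy between policy logits and logged actions."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(actions)), actions].mean()

def cql_penalty(q_values, actions):
    """Conservative term: logsumexp over all actions minus Q of the logged action.

    Minimizing it pushes Q down on actions absent from the dataset and up on
    the actions that were actually taken. (Assumes discrete actions.)
    """
    m = q_values.max(axis=1, keepdims=True)
    lse = (m + np.log(np.exp(q_values - m).sum(axis=1, keepdims=True))).squeeze(1)
    data_q = q_values[np.arange(len(actions)), actions]
    return (lse - data_q).mean()

def cql_loss(q_values, actions, td_targets, alpha=1.0):
    """Squared TD error on logged transitions plus the weighted conservative term."""
    data_q = q_values[np.arange(len(actions)), actions]
    td_error = 0.5 * ((data_q - td_targets) ** 2).mean()
    return td_error + alpha * cql_penalty(q_values, actions)
```

BC only asks the policy to match logged actions state by state, so nothing in its loss accounts for what happens after a small deviation. The CQL objective instead learns action values from the same logged transitions while penalizing optimistic estimates for actions the data never shows, which is one plausible reading of why it holds up better in closed-loop rollouts.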

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning