Goal-constrained Sparse Reinforcement Learning for End-to-End Driving
Pranav Agarwal, Pierre de Beaucorps, Raoul de Charette

TL;DR
This paper introduces a goal-constrained sparse reward approach combined with curriculum learning for end-to-end driving, enabling better generalization and longer driving distances with minimal reward engineering.
Contribution
It proposes a novel curriculum learning framework using only navigation view maps and concurrent policies for improved generalization in goal-constrained sparse reinforcement learning.
Findings
Successfully generalizes to unseen road layouts
Drives significantly longer than during training
Reduces reliance on complex reward engineering
Abstract
Deep reinforcement learning for end-to-end driving is limited by the need for complex reward engineering. Sparse rewards can circumvent this challenge but suffer from long training times and lead to sub-optimal policies. In this work, we explore full-control driving with only a goal-constrained sparse reward and propose a curriculum learning approach for end-to-end driving using only navigation view maps, which benefit from a small virtual-to-real domain gap. To address the complexity of multiple driving policies, we learn concurrent individual policies that are selected at inference by a navigation system. We demonstrate the ability of our proposal to generalize to unseen road layouts and to drive significantly longer than during training.
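To make the two ideas in the abstract concrete, here is a minimal sketch of what a goal-constrained sparse reward and inference-time policy selection might look like. It is an illustration under stated assumptions, not the authors' implementation: the reward values, the goal radius, the termination conditions, the `StepInfo` fields, and the command names are all hypothetical and do not come from the paper.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class StepInfo:
    """Hypothetical per-step summary of the vehicle state."""
    position: Tuple[float, float]
    collided: bool
    off_road: bool


def sparse_goal_reward(
    info: StepInfo,
    goal: Tuple[float, float],
    goal_radius: float = 2.0,  # assumed tolerance, in the map's units
) -> Tuple[float, bool]:
    """Return (reward, done) for one step.

    The agent receives a signal only at terminal events: failure
    (collision or leaving the road) or reaching the goal. Every other
    step yields zero reward, so no dense shaping terms are needed.
    """
    if info.collided or info.off_road:
        return -1.0, True  # assumed failure penalty
    if math.dist(info.position, goal) <= goal_radius:
        return 1.0, True  # assumed success reward
    return 0.0, False


# A policy maps an observation (e.g. a navigation view map) to an action.
Policy = Callable[[object], Tuple[float, float]]


def select_policy(policies: Dict[str, Policy], command: str) -> Policy:
    """Pick the individual policy matching the navigation command."""
    return policies[command]
```

In this sketch, one policy would be trained per driving maneuver (for example hypothetical `"left"`, `"right"`, and `"straight"` entries in `policies`), and a navigation system would supply `command` at each decision point during inference.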
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Transportation and Mobility Innovations
