Efficient Sampling-Based Maximum Entropy Inverse Reinforcement Learning with Application to Autonomous Driving

Zheng Wu; Liting Sun; Wei Zhan; Chenyu Yang; Masayoshi Tomizuka

arXiv:2006.13704·cs.RO·June 25, 2020

TL;DR

This paper introduces an efficient sampling-based maximum-entropy IRL algorithm for autonomous driving that learns reward functions directly in the continuous domain from real traffic data, improving prediction accuracy and convergence speed.

Contribution

It presents a novel continuous-domain IRL method with an efficient trajectory sampler, enhancing learning speed and accuracy over existing algorithms.

Findings

01. Achieves more accurate reward learning from real driving data.

02. Converges faster than baseline IRL algorithms.

03. Generalizes well to different driving scenarios.

Abstract

In the past decades, we have witnessed significant progress in the domain of autonomous driving. Advanced techniques based on optimization and reinforcement learning (RL) have become increasingly powerful at solving the forward problem: given designed reward/cost functions, how should we optimize them to obtain driving policies that interact with the environment safely and efficiently? Such progress has raised another equally important question: what should we optimize? Instead of manually specifying the reward functions, it is desirable to extract what human drivers try to optimize from real traffic data and assign it to autonomous vehicles, enabling more naturalistic and transparent interaction between humans and intelligent agents. To address this issue, we present an efficient sampling-based maximum-entropy inverse reinforcement learning (IRL) algorithm in this paper.…
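As a rough illustration of what a sampling-based maximum-entropy IRL update can look like, the sketch below computes the standard gradient of a demonstration's log-likelihood when the partition function is approximated by a finite set of sampled trajectories. It is not the authors' implementation: the linear reward form, the function names (`softmax_weights`, `maxent_irl_gradient`, `fit`), the feature vectors, and the learning-rate settings are all assumptions made for this example, and the paper's trajectory sampler is not reproduced here.

```python
import math


def softmax_weights(rewards):
    """Return weights proportional to exp(reward), computed stably."""
    m = max(rewards)
    exps = [math.exp(r - m) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def maxent_irl_gradient(theta, demo_features, sample_features):
    """Gradient of log P(demo) with a sample-based partition function.

    Assumes a linear reward R(traj) = theta . f(traj) (an assumption of
    this sketch). The gradient is the demonstrated feature vector minus
    the expected feature vector under the sample distribution.
    """
    rewards = [
        sum(t * f for t, f in zip(theta, feats)) for feats in sample_features
    ]
    weights = softmax_weights(rewards)
    expected = [
        sum(w * feats[k] for w, feats in zip(weights, sample_features))
        for k in range(len(theta))
    ]
    return [d - e for d, e in zip(demo_features, expected)]


def fit(demo_features, sample_features, lr=0.1, steps=200):
    """Plain gradient ascent on the reward weights (hypothetical settings)."""
    theta = [0.0] * len(demo_features)
    for _ in range(steps):
        grad = maxent_irl_gradient(theta, demo_features, sample_features)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta
```

In this toy setting, a demonstration whose feature value exceeds the sample average pushes the corresponding reward weight upward, which is the feature-matching behavior maximum-entropy IRL is built around.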

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Traffic control and management

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings