Query-Efficient Imitation Learning for End-to-End Autonomous Driving

Jiakai Zhang; Kyunghyun Cho

arXiv:1605.06450·cs.LG·May 23, 2016·115 cites

TL;DR

This paper introduces SafeDAgger, an extension of DAgger, which improves query efficiency and speeds up learning in autonomous driving by reducing reliance on expensive reference policies.

Contribution

SafeDAgger is a novel, query-efficient imitation learning algorithm that enhances end-to-end autonomous driving by accelerating convergence and reducing reference policy queries.

Findings

01

SafeDAgger requires fewer queries to the reference policy.

02

It achieves faster convergence in a car racing simulator.

03

Automated curriculum learning may contribute to improved training speed.

Abstract

One way to approach end-to-end autonomous driving is to learn a policy function that maps from a sensory input, such as an image frame from a front-facing camera, to a driving action, by imitating an expert driver, or a reference policy. This can be done by supervised learning, where a policy function is tuned to minimize the difference between the predicted and ground-truth actions. A policy function trained in this way, however, is known to suffer from unexpected behaviours due to the mismatch between the states reachable by the reference policy and those reachable by the trained policy function. More advanced algorithms for imitation learning, such as DAgger, address this issue by iteratively collecting training examples from both reference and trained policies. These algorithms often require a large number of queries to a reference policy, which is undesirable as the reference policy is often expensive.…
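To make the query cost mentioned in the abstract concrete, the following is a minimal sketch of a simplified DAgger-style loop on a toy one-dimensional "lane offset" problem. It is not the paper's SafeDAgger algorithm and not the authors' code: the linear `reference_policy`, the least-squares `fit`, the noise level, and the omission of DAgger's policy-mixing coefficient are all assumptions chosen for illustration. The first pass is plain supervised learning on states visited by the reference policy; later passes visit states with the trained policy and ask the reference policy to label them.

```python
import random


def reference_policy(x):
    """Hypothetical expert: steer back toward the lane centre (x = 0)."""
    return -0.5 * x


def fit(dataset):
    """Least-squares fit of a linear policy a = w * x to (state, action) pairs."""
    num = sum(x * a for x, a in dataset)
    den = sum(x * x for x, _ in dataset)
    return num / den if den > 0 else 0.0


def rollout(policy, rng, steps=20):
    """Drive with `policy` from a random start and return the visited states."""
    x = rng.uniform(-1.0, 1.0)
    states = []
    for _ in range(steps):
        states.append(x)
        x = x + policy(x) + rng.gauss(0.0, 0.05)  # toy dynamics with small noise
    return states


def dagger(iterations=5, seed=0):
    """Return the learned weight and the number of reference-policy queries."""
    rng = random.Random(seed)
    dataset = []
    queries = 0

    # Iteration 0: supervised learning on states reached by the reference policy.
    for x in rollout(reference_policy, rng):
        dataset.append((x, reference_policy(x)))
        queries += 1
    w = fit(dataset)

    for _ in range(iterations):
        # Visit states with the trained policy, then label each one by
        # querying the reference policy and aggregate into the dataset.
        for x in rollout(lambda s: w * s, rng):
            dataset.append((x, reference_policy(x)))
            queries += 1
        w = fit(dataset)

    return w, queries
```

In this sketch every visited state costs one reference-policy query, so the query count grows linearly with the number of iterations and rollout length. That per-state cost is what a query-efficient variant would aim to reduce.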

Code & Models

Repositories

mbhenaff/EEN
pytorch

Taxonomy

Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Multimodal Machine Learning Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings