Scaling Vision-based End-to-End Driving with Multi-View Attention Learning

Yi Xiao; Felipe Codevilla; Diego Porres; Antonio M. Lopez

arXiv:2302.03198·cs.CV·July 25, 2023

TL;DR

This paper introduces CIL++, an improved vision-based end-to-end driving model that uses high-resolution images and attention mechanisms, achieving competitive performance with less costly supervision.

Contribution

CIL++ improves on CILRS by processing higher-resolution images with a human-inspired horizontal field of view (HFOV) and by integrating a proper attention mechanism. It is intended to serve as a strong, cost-effective baseline.

Findings

01 CIL++ achieves performance comparable to models that rely on expensive sensor suites or large amounts of human-labeled data.

02 Using high-resolution images and attention improves vision-based driving.

03 CIL++ is a cost-effective alternative for end-to-end driving models.

Abstract

In end-to-end driving, human driving demonstrations are used to train perception-based driving models by imitation learning. This process is supervised on vehicle signals (e.g., steering angle, acceleration) but does not require extra costly supervision (human labeling of sensor data). As a representative of such vision-based end-to-end driving models, CILRS is commonly used as a baseline against which new driving models are compared. So far, some recent models achieve better performance than CILRS by using expensive sensor suites and/or large amounts of human-labeled training data. Given the difference in performance, one may think that it is not worth pursuing vision-based pure end-to-end driving. However, we argue that this approach still has great value and potential considering cost and maintenance. In this paper, we present CIL++, which improves on CILRS by both processing…

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Advanced Neural Network Applications · Video Surveillance and Tracking Methods