Using Segmentation Masks in the ICCV 2019 Learning to Drive Challenge

Antonia Lovjer; Minsu Yeom; Benedikt D. Schifferer; Iddo Drori

arXiv:1910.10317 · cs.CV · October 24, 2019 · 1 citation

TL;DR

This paper improves vehicle speed and steering angle prediction from camera frames by feeding segmentation masks from a pre-trained network into several neural network models, and it placed second in the ICCV 2019 Learning to Drive challenge.

Contribution

The paper augments camera frames with segmentation masks from an external pre-trained network and ensembles three diverse models to predict vehicle speed and steering angle.

Findings

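01 Achieved second place overall, and the second-best steering-angle MSE, in the ICCV 2019 Learning to Drive challenge

02 Ensembling three diverse models (a single-image CNN, a stacked CNN over image sequences, and a bidirectional GRU) improves performance

03 Augmenting input frames with segmentation masks from a pre-trained network improves speed and steering-angle prediction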
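
To illustrate the ensembling finding, here is a minimal sketch of combining per-frame predictions from several models. The paper does not state on this page how the three models are combined, so the weighted average below is an assumption, and the names `ensemble_predictions`, `mse`, and all numeric values are hypothetical.

```python
def ensemble_predictions(model_outputs, weights=None):
    """Combine per-frame (speed, angle) predictions from several models.

    model_outputs: one list per model, each holding (speed, angle) tuples,
                   aligned frame by frame.
    weights:       optional per-model weights; defaults to a uniform average.
    NOTE: weighted averaging is an assumed combination rule, not necessarily
    the one used in the paper.
    """
    n_models = len(model_outputs)
    if n_models == 0:
        raise ValueError("need at least one model")
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")
    total = float(sum(weights))

    combined = []
    # zip(*...) walks over frames, yielding one prediction per model
    for per_frame in zip(*model_outputs):
        speed = sum(w * p[0] for w, p in zip(weights, per_frame)) / total
        angle = sum(w * p[1] for w, p in zip(weights, per_frame)) / total
        combined.append((speed, angle))
    return combined


def mse(predictions, targets):
    """Mean squared error between two equal-length sequences of floats."""
    if len(predictions) != len(targets):
        raise ValueError("length mismatch")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


# Hypothetical outputs for two frames from the three model types
single_image_cnn = [(10.0, 1.0), (12.0, -2.0)]
stacked_cnn = [(11.0, 2.0), (13.0, -1.0)]
bidirectional_gru = [(12.0, 3.0), (14.0, 0.0)]

combined = ensemble_predictions([single_image_cnn, stacked_cnn, bidirectional_gru])

# Hypothetical ground-truth steering angles, scored with the angle-MSE metric
true_angles = [2.0, -1.5]
angle_mse = mse([angle for _, angle in combined], true_angles)
```

Passing `weights` lets a stronger model count for more; choosing those weights on a held-out validation split is one common option.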

Abstract

In this work we predict vehicle speed and steering angle given camera image frames. Our key contribution is using an external pre-trained neural network for segmentation. We augment the raw images with their segmentation masks and mirror images. We ensemble three diverse neural network models: (i) a CNN using a single image and its segmentation mask, (ii) a stacked CNN taking as input a sequence of images and segmentation masks, and (iii) a bidirectional GRU, extracting image features using a pre-trained ResNet34, DenseNet121, and our own single-image CNN model. We achieve the second-best performance for MSE angle and the second-best performance overall, winning 2nd place in the ICCV Learning to Drive challenge. We make our models and code publicly available.
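
The augmentation step described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the mask is appended to the RGB frame as a fourth channel, the mask holds integer class IDs, and mirroring a frame negates its steering angle while leaving speed unchanged. None of these details are confirmed by the abstract, and the function names `add_mask_channel` and `mirror_sample` are hypothetical.

```python
import numpy as np


def add_mask_channel(image, mask):
    """Stack an H x W x 3 image and an H x W mask into an H x W x 4 array.

    ASSUMPTION: the mask is fed to the network as an extra input channel.
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must share height and width")
    return np.concatenate([image, mask[..., np.newaxis]], axis=-1)


def mirror_sample(stacked, speed, angle):
    """Flip a stacked frame left-right to create a mirrored training sample.

    ASSUMPTION: mirroring reverses the sign of the steering angle and
    leaves the speed label unchanged.
    """
    flipped = stacked[:, ::-1, :].copy()  # reverse the width axis
    return flipped, speed, -angle


# Synthetic stand-ins for a camera frame and its segmentation mask
rng = np.random.default_rng(0)
image = rng.random((4, 6, 3)).astype(np.float32)
mask = rng.integers(0, 5, size=(4, 6)).astype(np.float32)  # class IDs (assumed)

stacked = add_mask_channel(image, mask)
flipped, speed_m, angle_m = mirror_sample(stacked, speed=12.0, angle=-3.5)
```

Flipping the image and the mask together, after stacking, keeps the two spatially aligned in the mirrored sample.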

Code & Models

Repositories

AntoniaLovjer/learntodrive (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Gated Recurrent Unit