Periphery-Fovea Multi-Resolution Driving Model guided by Human Attention

Ye Xia, Jinkyu Kim, John Canny, Karl Zipser, and David Whitney

arXiv:1903.09950 · cs.CV · March 26, 2019 · 5 cites

TL;DR

This paper introduces a multi-resolution driving model guided by human attention. It combines low-resolution peripheral input with high-resolution foveal input to improve vehicle speed prediction, especially in pedestrian-involved critical situations.

Contribution

It presents a periphery-fovea multi-resolution model whose fovea selection is trained with supervision from driver gaze. The model improves driving accuracy over a uni-resolution periphery-only model, with larger gains in pedestrian-involved critical situations.

Findings

01

Adding high-resolution input from predicted driver gaze locations significantly improves driving accuracy.

02

The performance gain is significantly higher in pedestrian-involved critical situations.

03

The model outperforms a uni-resolution periphery-only model that uses the same number of floating-point operations.

Abstract

Inspired by human vision, we propose a new periphery-fovea multi-resolution driving model that predicts vehicle speed from dash camera videos. The peripheral vision module of the model processes the full video frames in low resolution. Its foveal vision module selects sub-regions and uses high-resolution input from those regions to improve its driving performance. We train the fovea selection module with supervision from driver gaze. We show that adding high-resolution input from predicted human driver gaze locations significantly improves the driving accuracy of the model. Our periphery-fovea multi-resolution model outperforms a uni-resolution periphery-only model that has the same amount of floating-point operations. More importantly, we demonstrate that our driving model achieves a significantly higher performance gain in pedestrian-involved critical situations than in other…
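The two input streams described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `peripheral_view` and `foveal_crop`, the average-pooling downsampler, the choice of the gaze-map peak as the fovea centre, and all sizes are assumptions made for illustration only. The abstract does not state how fovea locations are derived from the predicted gaze, so a single argmax is used here as the simplest stand-in.

```python
import numpy as np


def peripheral_view(frame, factor):
    """Downsample a frame by average pooling over factor x factor blocks.

    Hypothetical stand-in for the low-resolution peripheral input.
    """
    h, w, c = frame.shape
    h_out, w_out = h // factor, w // factor
    # Trim any remainder so the frame tiles evenly into blocks.
    trimmed = frame[:h_out * factor, :w_out * factor]
    blocks = trimmed.reshape(h_out, factor, w_out, factor, c)
    return blocks.mean(axis=(1, 3))


def foveal_crop(frame, gaze_map, crop_size):
    """Crop a full-resolution patch centred on the peak of a gaze map.

    Assumes the crop is no larger than the frame. Using the argmax is an
    illustrative choice, not necessarily the paper's selection rule.
    """
    h, w, _ = frame.shape
    gh, gw = gaze_map.shape
    crop_h, crop_w = crop_size
    # Peak of the (low-resolution) gaze map, as row/column indices.
    gy, gx = np.unravel_index(np.argmax(gaze_map), gaze_map.shape)
    # Map the centre of that gaze cell to full-resolution pixel coordinates.
    cy = int((gy + 0.5) * h / gh)
    cx = int((gx + 0.5) * w / gw)
    # Clamp the crop's top-left corner so the patch stays inside the frame.
    top = min(max(cy - crop_h // 2, 0), h - crop_h)
    left = min(max(cx - crop_w // 2, 0), w - crop_w)
    return frame[top:top + crop_h, left:left + crop_w], (top, left)


# Example with made-up sizes: a 720x1280 frame and a 36x64 gaze map.
rng = np.random.default_rng(0)
frame = rng.random((720, 1280, 3))
gaze_map = np.zeros((36, 64))
gaze_map[10, 50] = 1.0  # pretend the predicted gaze peaks here

periphery = peripheral_view(frame, factor=8)
fovea, (top, left) = foveal_crop(frame, gaze_map, crop_size=(240, 240))
print(periphery.shape, fovea.shape, (top, left))
```

In this sketch the peripheral stream sees the whole scene at reduced resolution, while the foveal stream sees only a small window at full resolution, which is the general trade-off the abstract describes when it compares against a uni-resolution model at equal floating-point cost.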

Code & Models

Repositories

pascalxia/periphery_fovea_driving (TensorFlow)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Autonomous Vehicle Technology and Safety

Methods: SPEED (Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings)