Deep Steering: Learning End-to-End Driving Model from Spatial and Temporal Visual Cues
Lu Chi, Yadong Mu

TL;DR
This paper presents a deep learning model for autonomous driving that maps camera images directly to steering angles. The model uses recurrent units to combine spatial and temporal cues, is trained on real human driving data, and is visualized for interpretability.
Contribution
The work introduces an end-to-end driving model that uses recurrent units to combine spatial and temporal information. The model is trained on real human driving data, and visualization is used to make its predictions easier to interpret.
Findings
Outperforms state-of-the-art models in steering prediction accuracy.
Demonstrates robustness under lighting changes and abrupt turns.
Utilizes real human driving data for training and evaluation.
Abstract
In recent years, autonomous driving algorithms using low-cost vehicle-mounted cameras have attracted growing effort from both academia and industry. This effort has multiple fronts, including object detection on roads and 3-D reconstruction, but in this work we focus on a vision-based model that directly maps raw input images to steering angles using deep networks. This represents a nascent research topic in computer vision. The technical contributions of this work are three-fold. First, the model is learned and evaluated on real human driving videos that are time-synchronized with other vehicle sensors. This differs from many prior models trained on synthetic data from racing games. Second, state-of-the-art models, such as PilotNet, mostly predict the wheel angles independently on each video frame, which contradicts the common understanding of driving as a stateful…
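The contrast the abstract draws between per-frame prediction and stateful prediction can be illustrated with a small sketch. This is not the paper's architecture, which the truncated abstract does not fully describe. It is a generic LSTM cell (sigmoid gates, tanh candidate) written in plain Python, assuming each video frame has already been reduced to a short feature vector by some visual front end. The names `lstm_step`, `predict_steering`, and `init_params`, the random weights, and the linear output layer are all hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, W, b):
    """One step of a standard LSTM cell (assumed form, not the paper's exact unit).

    x: per-frame feature vector, h: hidden state, c: cell state.
    W[gate] has len(h) rows of len(x) + len(h) weights; b[gate] has len(h) biases.
    """
    z = x + h  # concatenate the current input with the previous hidden state

    def affine(gate):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(W[gate], b[gate])]

    i = [sigmoid(a) for a in affine("i")]    # input gate
    f = [sigmoid(a) for a in affine("f")]    # forget gate
    o = [sigmoid(a) for a in affine("o")]    # output gate
    g = [math.tanh(a) for a in affine("g")]  # candidate cell update
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def predict_steering(frames, W, b, w_out, hidden_size):
    """Return one steering value per frame, carrying state across frames."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    angles = []
    for x in frames:
        h, c = lstm_step(x, h, c, W, b)
        # Hypothetical linear read-out from the hidden state to an angle.
        angles.append(sum(w * v for w, v in zip(w_out, h)))
    return angles


def init_params(input_size, hidden_size, seed=0):
    """Random, untrained weights; for illustration only."""
    rng = random.Random(seed)
    W = {gate: [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]
         for gate in "ifog"}
    b = {gate: [0.0] * hidden_size for gate in "ifog"}
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
    return W, b, w_out


# Usage: the same feature vector fed three times in a row.
W, b, w_out = init_params(input_size=4, hidden_size=3)
frame = [0.2, -0.1, 0.4, 0.3]
angles = predict_steering([frame, frame, frame], W, b, w_out, hidden_size=3)
```

Because the hidden and cell states persist across frames, identical inputs at different time steps generally produce different outputs. A model that predicts independently on each frame would return the same value for all three.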
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
Methods: Interpretability · Sigmoid Activation · Tanh Activation · Long Short-Term Memory
