Visuomotor Understanding for Representation Learning of Driving Scenes
Seokju Lee, Junsik Kim, Tae-Hyun Oh, Yongseop Jeong, Donggeun Yoo, Stephen Lin, In So Kweon

TL;DR
This paper introduces a self-supervised learning approach using paired visual and sensor data from dashboard cameras to improve scene understanding in driving scenarios, outperforming existing unsupervised methods.
Contribution
It proposes a novel end-to-end framework that predicts dense optical flow from a single frame using paired sensing data, capturing semantic and geometric knowledge for driving scene representation.
Findings
Outperforms competing unsupervised representations on semantic segmentation
Learns semantic and geometric scene understanding from unlabeled data
Enhances downstream tasks requiring detailed scene comprehension
Abstract
Dashboard cameras capture a tremendous amount of driving scene video each day. These videos are purposefully coupled with vehicle sensing data, such as from the speedometer and inertial sensors, providing an additional sensing modality for free. In this work, we leverage the large-scale unlabeled yet naturally paired data for visual representation learning in the driving scenario. A representation is learned in an end-to-end self-supervised framework for predicting dense optical flow from a single frame with paired sensing data. We postulate that success on this task requires the network to learn semantic and geometric knowledge in the ego-centric view. For example, forecasting a future view to be seen from a moving vehicle requires an understanding of scene depth, scale, and movement of objects. We demonstrate that our learned representation can benefit other tasks that require…
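The abstract's claim that flow prediction requires knowledge of scene depth and scale can be illustrated with a small geometric sketch. It is not the authors' network, which learns this relationship implicitly from data. It assumes a pinhole camera, a static scene point, and pure forward translation with no rotation; the function name `forward_motion_flow` and all of its parameters are hypothetical. If a point at depth Z projects to pixel offset (x - cx) from the principal point and the camera advances by d = speed · dt, the point's new offset is (x - cx) · Z / (Z - d), so its displacement is (x - cx) · d / (Z - d).

```python
def forward_motion_flow(x, y, depth, speed, dt, cx=0.0, cy=0.0):
    """Image displacement of a static point under pure forward camera motion.

    Illustrative assumptions (not taken from the paper): pinhole camera,
    no rotation, static scene.
    (x, y)   pixel coordinates of the point in the current frame
    (cx, cy) principal point in pixels
    depth    distance to the point along the optical axis, in metres
    speed    forward vehicle speed in m/s (e.g. from the speedometer)
    dt       time between the two frames, in seconds
    """
    travelled = speed * dt
    if depth <= travelled:
        raise ValueError("point would pass behind the camera within dt")
    # The point's offset from the principal point grows by depth / (depth - travelled),
    # so the displacement is the offset times travelled / (depth - travelled).
    scale = travelled / (depth - travelled)
    return ((x - cx) * scale, (y - cy) * scale)


# Same pixel, same speed: the nearer point moves farther in the image.
near = forward_motion_flow(100.0, 50.0, depth=6.0, speed=10.0, dt=0.1)
far = forward_motion_flow(100.0, 50.0, depth=51.0, speed=10.0, dt=0.1)
```

Under these assumptions, the same speed reading yields very different flow for near and far points, which is why a single frame plus sensor data does not determine the flow unless the model also infers depth.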
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Neural Network Applications · Remote Sensing and LiDAR Applications
