SkyEye: Self-Supervised Bird's-Eye-View Semantic Mapping Using Monocular Frontal View Images
Nikhil Gosala, Kürsat Petek, Paulo L. J. Drews-Jr, Wolfram Burgard, Abhinav Valada

TL;DR
SkyEye introduces a novel self-supervised method for generating Bird's-Eye-View semantic maps from monocular frontal images, reducing reliance on extensive annotated BEV data and achieving competitive results.
Contribution
It is the first self-supervised approach for BEV semantic mapping from monocular images, leveraging implicit and explicit supervision without extensive BEV annotations.
Findings
Performs on par with state-of-the-art fully supervised methods.
Achieves competitive results with only 1% of BEV supervision.
Successfully leverages frontal-view (FV) semantic annotations and self-supervised depth estimates.
Abstract
Bird's-Eye-View (BEV) semantic maps have become an essential component of automated driving pipelines due to the rich representation they provide for decision-making tasks. However, existing approaches for generating these maps still follow a fully supervised training paradigm and hence rely on large amounts of annotated BEV data. In this work, we address this limitation by proposing the first self-supervised approach for generating a BEV semantic map using a single monocular image from the frontal view (FV). During training, we overcome the need for BEV ground truth annotations by leveraging the more easily available FV semantic annotations of video sequences. Thus, we propose the SkyEye architecture that learns based on two modes of self-supervision, namely, implicit supervision and explicit supervision. Implicit supervision trains the model by enforcing spatial consistency of the…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging · Video Surveillance and Tracking Methods
