Understanding Bird's-Eye View of Road Semantics using an Onboard Camera
Yigit Baran Can, Alexander Liniger, Ozan Unal, Danda Paudel, Luc Van Gool

TL;DR
This paper presents a method for online estimation of semantic bird's-eye view (BEV) maps from the video of a single onboard camera. It improves scene understanding for autonomous vehicles by combining image-level understanding, BEV-level understanding, and the aggregation of temporal information.
Contribution
The paper introduces a new architecture that combines image-level understanding, BEV-level understanding, and temporal aggregation, improving BEV scene understanding over existing methods.
Findings
The proposed architecture outperforms current state-of-the-art methods.
Image-level understanding, BEV-level understanding, and temporal aggregation are complementary for BEV understanding.
Combining the three improves the accuracy of BEV map estimation.
Abstract
Autonomous navigation requires scene understanding of the action-space to move or anticipate events. For planner agents moving on the ground plane, such as autonomous vehicles, this translates to scene understanding in the bird's-eye view (BEV). However, the onboard cameras of autonomous cars are customarily mounted horizontally for a better view of the surroundings. In this work, we study scene understanding in the form of online estimation of semantic BEV maps using the video input from a single onboard camera. We study three key aspects of this task: image-level understanding, BEV-level understanding, and the aggregation of temporal information. Based on these three pillars, we propose a novel architecture that combines these three aspects. In our extensive experiments, we demonstrate that the considered aspects are complementary to each other for BEV understanding. Furthermore, the…
Taxonomy
Topics: Multimodal Machine Learning Applications · Robotic Path Planning Algorithms · Advanced Image and Video Retrieval Techniques
