Monocular BEV Perception of Road Scenes via Front-to-Top View Projection
Wenxi Liu, Qi Li, Weixiang Yang, Jiaxin Cai, Yuanlong Yu, Yuexin Ma, Shengfeng He, Jia Pan

TL;DR
This paper introduces a novel monocular camera-based framework for reconstructing bird's-eye view maps of road scenes, improving accuracy and efficiency over existing methods by using a front-to-top view projection with cycle consistency and multi-scale features.
Contribution
The paper presents a new front-to-top view projection module with cycle consistency and multi-scale features for monocular BEV perception, achieving state-of-the-art results and real-time performance.
Findings
State-of-the-art performance in road layout and vehicle occupancy estimation
Outperforms competitors in multi-class semantic estimation
Runs at 25 FPS on a single GPU
Abstract
HD map reconstruction is crucial for autonomous driving. LiDAR-based methods are limited due to expensive sensors and time-consuming computation. Camera-based methods usually need to perform road segmentation and view transformation separately, which often causes distortion and missing content. To push the limits of the technology, we present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view given a front-view monocular image only. We propose a front-to-top view projection (FTVP) module, which takes the constraint of cycle consistency between views into account and makes full use of their correlation to strengthen the view transformation and scene understanding. In addition, we also apply multi-scale FTVP modules to propagate the rich spatial information of low-level features to mitigate spatial deviation of the predicted…
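The cycle-consistency constraint mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's FTVP module: it assumes plain linear maps in place of learned projection networks and an L1 reconstruction penalty, neither of which the abstract confirms. The names `matvec`, `cycle_consistency_loss`, `w_front_to_top`, and `w_top_to_front` are hypothetical. The idea shown is only that a front-view feature projected to the top view and back should reproduce the original feature.

```python
def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def cycle_consistency_loss(front, w_front_to_top, w_top_to_front):
    """Return the top-view feature and the mean L1 round-trip error.

    Assumption: both view projections are linear maps. A real model
    would use learned networks operating on feature maps.
    """
    top = matvec(w_front_to_top, front)        # front view -> top view
    front_back = matvec(w_top_to_front, top)   # top view -> front view
    loss = sum(abs(a - b) for a, b in zip(front, front_back)) / len(front)
    return top, loss


# Toy front-view feature vector.
front = [1.0, 2.0, 3.0]

# A permutation that swaps the first two entries; it is its own inverse,
# so projecting forward and back recovers the input exactly.
swap = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
top, loss_good = cycle_consistency_loss(front, swap, swap)

# A mismatched backward map (uniform scaling) breaks the cycle,
# so the round-trip error is non-zero.
scale = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 0.5]]
_, loss_bad = cycle_consistency_loss(front, swap, scale)

print(top, loss_good, loss_bad)
```

In training, a term like this would typically be added to the main segmentation loss so that the forward projection is discouraged from discarding information the backward projection needs.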
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Neural Network Applications
