Estimation of Appearance and Occupancy Information in Birds Eye View from Surround Monocular Images
Sarthak Sharma, Unnikrishnan R. Nair, Udit Singh Parihar, Midhun Menon S, Srikanth Vidapanakal

TL;DR
This paper introduces a novel method to generate a comprehensive Birds-eye View (BEV) that includes appearance and occupancy details of traffic participants from multiple monocular cameras, enhancing scene understanding for autonomous driving.
Contribution
It proposes a new approach to produce a detailed BEV with appearance information using learned image embeddings from surround monocular cameras, improving scene interpretation.
Findings
Effective scene representation with appearance and occupancy details
Improved downstream task performance such as object tracking
Validated on synthetic CARLA dataset
Abstract
Autonomous driving requires efficient reasoning about the location and appearance of the different agents in the scene, which aids in downstream tasks such as object detection, object tracking, and path planning. The past few years have witnessed a surge in approaches that combine the different task-based modules of the classic self-driving stack into an End-to-End (E2E) trainable learning system. These approaches replace perception, prediction, and sensor fusion modules with a single contiguous module with shared latent space embedding, from which one extracts a human-interpretable representation of the scene. One of the most popular representations is the Birds-eye View (BEV), which expresses the location of different traffic participants in the ego vehicle frame from a top-down view. However, a BEV does not capture the chromatic appearance information of the participants. To overcome…
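To make the representation concrete, the following is a minimal sketch of a BEV grid that stores both occupancy and a per-cell colour for each traffic participant. It is not the paper's method, which learns the BEV from surround camera embeddings; here the agent positions and colours are simply assumed to be known. The function name `rasterize_bev`, the grid size, the metric extent, and the axis convention (x forward, y left, ego at the grid centre) are all illustrative assumptions.

```python
import math


def rasterize_bev(agents, grid_size=64, extent_m=32.0):
    """Rasterize agents into a top-down grid centred on the ego vehicle.

    agents: iterable of (x_m, y_m, (r, g, b)) in the ego frame,
            with x pointing forward and y pointing left (assumed convention).
    Returns (occupancy, appearance) as nested lists of shape
    [grid_size][grid_size]; appearance cells hold RGB tuples.
    """
    res = 2.0 * extent_m / grid_size  # metres covered by one cell
    occupancy = [[0] * grid_size for _ in range(grid_size)]
    appearance = [[(0, 0, 0)] * grid_size for _ in range(grid_size)]

    for x, y, rgb in agents:
        # Forward (+x) maps to smaller row indices, left (+y) to smaller columns.
        row = math.floor((extent_m - x) / res)
        col = math.floor((extent_m - y) / res)
        # Agents outside the covered area are skipped.
        if 0 <= row < grid_size and 0 <= col < grid_size:
            occupancy[row][col] = 1
            appearance[row][col] = rgb
    return occupancy, appearance


# Hypothetical scene: a red car ahead, a blue car to the right, one agent out of range.
scene = [
    (10.0, 0.0, (255, 0, 0)),
    (0.0, -5.0, (0, 0, 255)),
    (100.0, 0.0, (0, 255, 0)),
]
occupancy, appearance = rasterize_bev(scene)
```

A plain occupancy BEV would keep only the first grid; the second grid is the kind of chromatic information the abstract says a standard BEV lacks.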
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Brain Tumor Detection and Classification
Methods: Entropy Regularization · Proximal Policy Optimization · Test · CARLA: An Open Urban Driving Simulator
