Camera Perspective Transformation to Bird's Eye View via Spatial Transformer Model for Road Intersection Monitoring
Rukesh Prajapati, Amr S. El-Wakeel

TL;DR
This paper introduces a deep learning model that converts a single camera view of a road intersection into a bird's eye view (BEV). The goal is accurate traffic monitoring from an ordinary camera, which would bridge the gap between simulation and practical deployment.
Contribution
The novel SDD-UNet model effectively transforms perspective images into BEV with reduced distortion and high accuracy, facilitating real-world traffic intersection management.
Findings
Achieves a Dice similarity coefficient (DSC) above 95%, outperforming the original UNet by 40%.
Maintains a low mean absolute error (MAE) of 0.102 meters for vehicle position estimation.
Predicts vehicle masks with high spatial accuracy.
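For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how DSC and a position MAE are commonly computed for binary vehicle masks. It is not the authors' evaluation code: the function names, the use of mask centroids as vehicle positions, the averaging over both x and y coordinates, and the `meters_per_pixel` scale are all assumptions made for illustration.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement (a convention)
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / total)

def mask_center(mask):
    """Centroid (x, y) in pixels of the nonzero region of a mask."""
    ys, xs = np.nonzero(mask)
    return np.array([xs.mean(), ys.mean()])

def center_mae_m(pred_centers, true_centers, meters_per_pixel):
    """Mean absolute error between centers, converted to meters.

    meters_per_pixel is a hypothetical BEV scale, not a value from the paper.
    """
    diff = np.abs(np.asarray(pred_centers, float) - np.asarray(true_centers, float))
    return float(diff.mean() * meters_per_pixel)

# Toy example: a 4x4 predicted vehicle mask shifted one pixel from the target.
pred = np.zeros((10, 10), dtype=bool)
pred[2:6, 2:6] = True
target = np.zeros((10, 10), dtype=bool)
target[3:7, 2:6] = True

dsc = dice_coefficient(pred, target)
mae = center_mae_m([mask_center(pred)], [mask_center(target)], meters_per_pixel=0.1)
print(f"DSC = {dsc:.2f}, center MAE = {mae:.3f} m")
```

In this toy case the masks overlap on 12 of their 16 pixels each, so the DSC is 0.75, and the one-pixel vertical offset at the assumed 0.1 m per pixel gives a center MAE of 0.05 m when averaged over both coordinates.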
Abstract
Road intersection monitoring and control research often utilizes bird's eye view (BEV) simulators. In real traffic settings, achieving a BEV akin to that in a simulator requires deploying drones or mounting sensors in specific positions, which is neither feasible nor practical. Consequently, traffic intersection management remains confined to simulation environments. In this paper, we address the gap between simulated environments and real-world implementation by introducing a novel deep-learning model that converts a single camera's perspective of a road intersection into a BEV. We created a simulation environment that closely resembles a real-world traffic junction. The proposed model transforms the vehicles into BEV images, facilitating road intersection monitoring and control model processing. Inspired by image transformation techniques, we propose a…
Taxonomy
Topics: Simulation and Modeling Applications
