UrbanNet: Leveraging Urban Maps for Long Range 3D Object Detection
Juan Carrillo, Steven Waslander

TL;DR
UrbanNet is a modular system that combines urban maps with monocular images for accurate long-range 3D object detection from static cameras, targeting traffic monitoring on roads with changing slope.
Contribution
The paper introduces a modular architecture that combines commonly available urban maps, a mature 2D object detector, and an efficient 3D object descriptor for long-range monocular 3D detection with static cameras.
Findings
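Accurate long-range detection on a new, challenging synthetic dataset
Advantages for traffic detection on roads with changing slope, where the flat-ground approximation does not hold
Detection remains accurate when objects are rotated about any of their three axes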
Abstract
Relying on monocular image data for precise 3D object detection remains an open problem, whose solution has broad implications for cost-sensitive applications such as traffic monitoring. We present UrbanNet, a modular architecture for long-range monocular 3D object detection with static cameras. Our proposed system combines commonly available urban maps with a mature 2D object detector and an efficient 3D object descriptor to accomplish accurate detection at long range, even when objects are rotated about any of their three axes. We evaluate UrbanNet on a novel, challenging synthetic dataset and highlight the advantages of its design for traffic detection on roads with changing slope, where the flat-ground approximation does not hold. Data and code are available at https://github.com/TRAILab/UrbanNet
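To illustrate why the flat-ground approximation matters at long range, the sketch below back-projects a single camera ray onto the ground in two ways: against the plane z = 0, and against a terrain height function. It is not taken from the UrbanNet code. The `elevation` callable is a hypothetical stand-in for an urban-map lookup, and the camera height, ray pitch, and 3% slope are illustrative assumptions.

```python
import math

def flat_ground_hit(origin, direction):
    """Intersect a ray with the plane z = 0 (the flat-ground approximation)."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dz >= 0:
        return None  # the ray never descends to the plane
    t = -oz / dz
    return (ox + t * dx, oy + t * dy, 0.0)

def terrain_hit(origin, direction, elevation, step=0.5, max_range=500.0):
    """March along the ray until it drops below the terrain, then bisect.

    `elevation(x, y)` is a hypothetical map lookup returning ground height.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction

    def height_above(t):
        # Signed height of the ray point above the terrain at parameter t.
        x, y, z = ox + t * dx, oy + t * dy, oz + t * dz
        return z - elevation(x, y)

    t_prev, t = 0.0, step
    while t <= max_range:
        if height_above(t) <= 0.0:
            lo, hi = t_prev, t
            for _ in range(40):  # refine the crossing by bisection
                mid = 0.5 * (lo + hi)
                if height_above(mid) > 0.0:
                    lo = mid
                else:
                    hi = mid
            t_hit = 0.5 * (lo + hi)
            x, y = ox + t_hit * dx, oy + t_hit * dy
            return (x, y, elevation(x, y))
        t_prev = t
        t += step
    return None  # no intersection within max_range

# Illustrative setup: camera 10 m above the origin, ray pitched 5 degrees down.
pitch = math.radians(5.0)
camera = (0.0, 0.0, 10.0)
ray = (math.cos(pitch), 0.0, -math.sin(pitch))

def uphill(x, y):
    """Assumed terrain: flat until x = 50 m, then a 3% upward slope."""
    return 0.03 * max(0.0, x - 50.0)

flat = flat_ground_hit(camera, ray)
sloped = terrain_hit(camera, ray, uphill)
print(f"flat-ground estimate: x = {flat[0]:.1f} m")
print(f"terrain-aware estimate: x = {sloped[0]:.1f} m")
```

In this setup the flat-ground estimate places the ground point farther from the camera than the terrain-aware one, and the gap widens as the ray becomes shallower. This is the kind of long-range error that motivates using map elevation data.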
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
