TL;DR
This paper introduces a novel monocular method for localizing and estimating the shape of vehicles on steep and graded roads using a moving camera, improving accuracy on non-flat surfaces.
Contribution
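It presents what the authors describe as, to the best of their knowledge, the first monocular approach for localizing vehicles and estimating their shape on roads that are not coplanar with the camera, by approximating the road with local planar patches and using semantic cues from the vehicles.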
Findings
Significantly improves monocular vehicle localization on complex road surfaces.
Effective on both KITTI and SYNTHIA-SF benchmarks.
Outperforms previous state-of-the-art methods.
Abstract
Accurate localization of other traffic participants is a vital task in autonomous driving systems. State-of-the-art systems employ a combination of sensing modalities such as RGB cameras and LiDARs for localizing traffic participants, but most such demonstrations have been confined to plain roads. We demonstrate, to the best of our knowledge, the first results for monocular object localization and shape estimation on surfaces that do not share the same plane with the moving monocular camera. We approximate road surfaces by local planar patches and use semantic cues from vehicles in the scene to initialize a local bundle-adjustment-like procedure that simultaneously estimates the pose and shape of the vehicles and the orientation of the local ground plane on which each vehicle stands. We evaluate the proposed approach on the KITTI and SYNTHIA-SF benchmarks, for a variety of road…
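To illustrate why the orientation of the local ground plane matters for monocular localization, the sketch below back-projects a pixel onto a plane by standard ray-plane intersection. It is not the paper's procedure, which the abstract describes only as a bundle-adjustment-like optimization. The function names, the camera frame convention (x right, y down, z forward) and the pitch parametrisation are all assumptions made for this example.

```python
import math


def backproject_to_plane(u, v, fx, fy, cx, cy, normal, offset):
    """Intersect the viewing ray through pixel (u, v) with the plane n . X = offset.

    Assumed camera frame: x right, y down, z forward, camera centre at the origin.
    `normal` is a unit 3-vector; `offset` is the plane's distance from the camera.
    Returns (X, Y, Z), or None if the ray never meets the plane in front of the camera.
    """
    # Direction of the viewing ray under a pinhole model (no lens distortion).
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the plane
    t = offset / denom
    if t <= 0:
        return None  # intersection lies behind the camera
    return tuple(t * r for r in ray)


def pitched_normal(pitch_rad):
    """Unit normal of a ground plane pitched about the camera x-axis.

    Hypothetical parametrisation: pitch 0 is a flat road below the camera,
    positive pitch is a road that rises ahead of the camera.
    """
    return (0.0, math.cos(pitch_rad), math.sin(pitch_rad))


# The same pixel, interpreted against a flat road and against a rising road.
flat_point = backproject_to_plane(0, 50, 100, 100, 0, 0, pitched_normal(0.0), 1.5)
uphill_point = backproject_to_plane(0, 50, 100, 100, 0, 0, (0.0, 0.8, 0.6), 1.5)
```

With these made-up intrinsics, the pixel lands twice as far away on the flat plane as on the rising one, so a method that assumes a flat road would misplace a vehicle standing on a slope. Estimating the local plane orientation alongside the vehicle pose addresses this source of error.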
