Vehicle Pose and Shape Estimation through Multiple Monocular Vision
Wenhao Ding, Shuaijun Li, Guilin Zhang, Xiangyu Lei, Huihuan Qian

TL;DR
This paper presents a method that estimates vehicle pose and shape from off-board multiview images taken by monocular cameras with small overlaps. It combines CNN-based semantic keypoint detection with an iterative optimization, targeting off-board visual vehicle localization and tracking.
Contribution
The paper combines CNN-based semantic keypoint detection with two components: Cross Projection Optimization (CPO), which estimates the 3D pose, and Hierarchical Wireframe Constraint (HWC), an adaptive shape adjustment applied during the iterative CPO process to estimate the vehicle's shape.
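To make the idea of optimizing a pose against keypoints seen from several cameras concrete, here is a minimal sketch of a generic multi-view reprojection cost. It is not the authors' CPO or HWC formulation, which this summary does not specify. The planar pose parameterization `(x, y, yaw)`, the five-point wireframe, the camera matrices, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def yaw_rotation(yaw):
    """Rotation about the world z-axis (assumed ground-plane vehicle motion)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def project(K, R, t, points_world):
    """Pinhole projection of Nx3 world points into one camera, returning Nx2 pixels."""
    cam = points_world @ R.T + t      # world frame -> camera frame
    uv = cam @ K.T                    # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]     # perspective divide

def multiview_reprojection_error(pose, wireframe, cameras, detections):
    """Sum of squared pixel residuals over all cameras.

    pose:       (x, y, yaw) of the vehicle (hypothetical parameterization)
    wireframe:  Nx3 keypoints in the vehicle frame
    cameras:    list of (K, R, t) tuples
    detections: list of Nx2 arrays; rows of NaN mark keypoints a camera does not see
    """
    x, y, yaw = pose
    world = wireframe @ yaw_rotation(yaw).T + np.array([x, y, 0.0])
    total = 0.0
    for (K, R, t), det in zip(cameras, detections):
        visible = ~np.isnan(det).any(axis=1)
        residual = project(K, R, t, world[visible]) - det[visible]
        total += float(np.sum(residual ** 2))
    return total

def rot_x(a):
    """Rotation about the x-axis, used here only to build example cameras."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Illustrative setup: two overhead cameras and a toy five-keypoint wireframe.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
cameras = [
    (K, rot_x(np.pi), np.array([0.0, 0.0, 10.0])),  # looking straight down
    (K, rot_x(2.5), np.array([0.0, 0.0, 10.0])),    # tilted view
]
wireframe = np.array([[2.0, 1.0, 0.0], [2.0, -1.0, 0.0],
                      [-2.0, 1.0, 0.0], [-2.0, -1.0, 0.0],
                      [0.0, 0.0, 1.5]])

# Synthetic detections generated from a known pose.
true_pose = (1.0, -0.5, 0.3)
true_world = wireframe @ yaw_rotation(true_pose[2]).T + np.array([true_pose[0], true_pose[1], 0.0])
detections = [project(Kc, R, t, true_world) for (Kc, R, t) in cameras]
detections[1][0] = np.nan  # the second camera misses the first keypoint

error_at_truth = multiview_reprojection_error(true_pose, wireframe, cameras, detections)
error_perturbed = multiview_reprojection_error((1.2, -0.5, 0.3), wireframe, cameras, detections)
```

A pose estimate would come from minimizing this cost over `pose` with any nonlinear least-squares solver. A shape term could be added by also letting the wireframe coordinates vary under constraints, which is the role the summary attributes to HWC.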
Findings
The algorithm outperforms other existing monocular and stereo methods for vehicle pose and shape estimation.
The approach is evaluated in both simulated and real-world scenes.
It offers a robust solution for off-board visual vehicle localization and tracking.
Abstract
In this paper, we present an accurate approach to estimate vehicles' pose and shape from off-board multiview images. The images are taken by monocular cameras and have small overlaps. We utilize state-of-the-art convolutional neural networks (CNNs) to extract vehicles' semantic keypoints and introduce a Cross Projection Optimization (CPO) method to estimate the 3D pose. During the iterative CPO process, an adaptive shape adjustment method named Hierarchical Wireframe Constraint (HWC) is implemented to estimate the shape. Our approach is evaluated under both simulated and real-world scenes for performance verification. It's shown that our algorithm outperforms other existing monocular and stereo methods for vehicles' pose and shape estimation. This approach provides a new and robust solution for off-board visual vehicle localization and tracking, which can be applied to massive…
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Video Surveillance and Tracking Methods
