Object-Based Visual Camera Pose Estimation From Ellipsoidal Model and 3D-Aware Ellipse Prediction
Matthieu Zins, Gilles Simon, Marie-Odile Berger

TL;DR
This paper introduces a learning-based approach for initial camera pose estimation from a single image using 3D-aware ellipse prediction, improving accuracy without requiring detailed scene models or extensive ground truth data.
Contribution
It advances scene abstraction by detecting 3D-coherent ellipses, enhancing pose accuracy with minimal training data and no need for detailed scene models.
Findings
Pose estimation accuracy increases significantly compared with using ellipses fitted directly to detection bounding boxes.
Requires only a few hundred calibrated images for training.
Code and models are publicly available.
Abstract
In this paper, we propose a method for initial camera pose estimation from just a single image which is robust to viewing conditions and does not require a detailed model of the scene. This method meets the growing need for easy deployment of robotics or augmented reality applications in any environment, especially those for which neither an accurate 3D model nor a large amount of ground truth data is available. It exploits the ability of deep learning techniques to reliably detect objects regardless of viewing conditions. Previous works have also shown that abstracting the geometry of a scene of objects by an ellipsoid cloud makes it possible to compute the camera pose accurately enough for various application needs. Though promising, these approaches use the ellipses fitted to the detection bounding boxes as an approximation of the imaged objects. In this paper, we go one step further and propose a…
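To make the geometry in the abstract concrete, the following minimal sketch contrasts the two kinds of ellipse it mentions: the perspective projection of a 3D ellipsoid, and the axis-aligned ellipse inscribed in a detection bounding box. It relies on the standard dual-quadric projection relation C* = P Q* Pᵀ and is not taken from the authors' released code. The function names, the camera intrinsics, and the example sphere are all hypothetical choices made for illustration.

```python
import numpy as np


def ellipsoid_dual_quadric(center, semi_axes, rotation=None):
    """Build the 4x4 dual quadric Q* of an ellipsoid (hypothetical helper)."""
    R = np.eye(3) if rotation is None else np.asarray(rotation, dtype=float)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = center
    a, b, c = semi_axes
    # Canonical dual quadric of an origin-centred, axis-aligned ellipsoid.
    D = np.diag([a * a, b * b, c * c, -1.0])
    return T @ D @ T.T


def project_ellipsoid(Q_star, P):
    """Project Q* with a 3x4 camera matrix P; return ellipse centre and semi-axes."""
    C_star = P @ Q_star @ P.T          # dual conic of the image ellipse
    C_star = C_star / -C_star[2, 2]    # normalise so that C*[2, 2] == -1
    center = -C_star[:2, 2]
    # With this normalisation the upper-left block equals R diag(a^2, b^2) R^T - c c^T.
    shape = C_star[:2, :2] + np.outer(center, center)
    eigvals = np.linalg.eigh(shape)[0]  # ascending order
    semi_axes = np.sqrt(eigvals[::-1])  # largest semi-axis first
    return center, semi_axes


def ellipse_from_bbox(x_min, y_min, x_max, y_max):
    """Axis-aligned ellipse inscribed in a detection box (the bounding-box approximation)."""
    center = np.array([(x_min + x_max) / 2.0, (y_min + y_max) / 2.0])
    semi_axes = np.array([(x_max - x_min) / 2.0, (y_max - y_min) / 2.0])
    return center, semi_axes


# Assumed pinhole camera at the origin looking down +Z (illustrative values only).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])

# A unit sphere 5 units in front of the camera, on the optical axis.
Q_star = ellipsoid_dual_quadric(center=[0.0, 0.0, 5.0], semi_axes=[1.0, 1.0, 1.0])
center, semi_axes = project_ellipsoid(Q_star, P)
print("projected ellipse centre:", center)
print("projected ellipse semi-axes:", semi_axes)
```

For an object on the optical axis the two ellipses coincide, but for an off-axis or rotated ellipsoid the projected ellipse is generally tilted and cannot be recovered from an axis-aligned box, which is the gap that 3D-aware ellipse prediction is meant to close.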
