The challenge of simultaneous object detection and pose estimation: a comparative study
Daniel Oñoro-Rubio, Roberto J. López-Sastre, Carolina Redondo-Cabrera, Pedro Gil-Jiménez

TL;DR
This paper explores deep learning solutions for simultaneous object detection and pose estimation. It proposes three novel architectures, compares classification and regression formulations of pose estimation, and reports state-of-the-art results on PASCAL3D+ and ObjectNet3D.
Contribution
It introduces three new deep learning architectures for joint detection and pose estimation, and provides a comprehensive comparative analysis of these methods and existing state-of-the-art approaches.
Findings
Achieved state-of-the-art performance on PASCAL3D+ and ObjectNet3D datasets.
Demonstrated the effectiveness of the proposed architectures in joint detection and pose estimation.
Provided insights into the classification versus regression approaches for pose estimation.
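To make the classification versus regression distinction concrete, here is a minimal sketch of how a single azimuth angle could be encoded as a target under each formulation. It is not taken from the paper: the function names, the choice of 8 bins, and the (cos, sin) encoding are all illustrative assumptions, and the paper's architectures may define their targets differently.

```python
import math


def azimuth_to_bin(azimuth_deg, num_bins=8):
    """Classification target: index of a discrete viewpoint bin.

    num_bins=8 is an assumed value for illustration only.
    """
    width = 360.0 / num_bins
    # Centre bin 0 on 0 degrees so near-frontal views fall in one bin.
    return int(((azimuth_deg + width / 2.0) % 360.0) // width)


def azimuth_to_vector(azimuth_deg):
    """Regression target: a point on the unit circle.

    Encoding the angle as (cos, sin) avoids the jump between 359 and 0 degrees.
    """
    rad = math.radians(azimuth_deg)
    return (math.cos(rad), math.sin(rad))


def vector_to_azimuth(cos_val, sin_val):
    """Decode a regressed (cos, sin) pair back to degrees."""
    return math.degrees(math.atan2(sin_val, cos_val)) % 360.0


def angular_error(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)
```

Under these assumptions, a classifier would be trained on the output of `azimuth_to_bin`, while a regressor would predict the pair from `azimuth_to_vector` and be scored with `angular_error` after decoding. The bin width caps the precision of the classification variant, which is one reason the choice between the two formulations is not obvious.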
Abstract
Detecting objects and estimating their pose remains one of the major challenges for the computer vision research community. There is a trade-off between localizing objects and estimating their viewpoints: the detector ideally needs to be view-invariant, while the pose estimator should generalize at the category level. This work explores deep learning models that solve both problems simultaneously. To this end, we propose three novel deep learning architectures that perform joint detection and pose estimation, gradually decoupling the two tasks. We also investigate whether pose estimation should be solved as a classification or a regression problem, which is still an open question in the computer vision community. We detail a comparative analysis of all our solutions and the methods that currently…
