Object Pose Estimation from Monocular Image using Multi-View Keypoint Correspondence
Jogendra Nath Kundu, Rahul M. V., Aditya Ganeshan, and R. Venkatesh Babu

TL;DR
This paper introduces a data-efficient, multi-view keypoint correspondence method for object pose estimation from monocular images, leveraging geometric regularity and pose-invariant descriptors to improve accuracy and reduce data dependency.
Contribution
The work presents a novel multi-view correspondence framework that enhances pose estimation accuracy and addresses data scarcity by learning pose-invariant local descriptors from simple RGB images.
Findings
Achieves state-of-the-art results on Pascal3D+ and ObjectNet3D datasets.
Effectively alleviates data scarcity issues in pose estimation.
Multi-view fusion significantly improves geometric understanding.
Abstract
Understanding the geometry and pose of objects in 2D images is a fundamental necessity for a wide range of real-world applications. Driven by deep neural networks, recent methods have brought significant improvements to object pose estimation. However, they suffer from the scarcity of keypoint/pose-annotated real images and hence cannot exploit the object's 3D structural information effectively. In this work, we propose a data-efficient method which utilizes the geometric regularity of intraclass objects for pose estimation. First, we learn pose-invariant local descriptors of object parts from simple 2D RGB images. These descriptors, along with keypoints obtained from renders of a fixed 3D template model, are then used to generate keypoint correspondence maps for a given monocular real image. Finally, a pose estimation network predicts 3D pose of the object using these correspondence…
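The correspondence-map step described in the abstract can be illustrated with a minimal sketch. It assumes that a dense descriptor map for the real image and one descriptor per template keypoint are already available, and it scores matches with cosine similarity followed by a softmax. The similarity measure, the `temperature` parameter, and the function name `correspondence_maps` are assumptions made for illustration; the abstract does not specify how the authors compute their maps, and the random arrays below stand in for learned descriptors.

```python
import numpy as np


def correspondence_maps(image_desc, keypoint_desc, temperature=0.1):
    """Hypothetical helper: one spatial matching map per template keypoint.

    image_desc:    (H, W, D) dense descriptors of the real image.
    keypoint_desc: (K, D) descriptors sampled at template keypoints.
    Returns an array of shape (K, H, W); each map sums to 1.
    """
    h, w, d = image_desc.shape
    flat = image_desc.reshape(-1, d)

    # L2-normalise so the dot product below is cosine similarity (assumed metric).
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    kp = keypoint_desc / (np.linalg.norm(keypoint_desc, axis=1, keepdims=True) + 1e-8)

    sim = kp @ flat.T  # (K, H*W) similarity of each keypoint to each pixel

    # Numerically stable softmax over all pixel locations.
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs.reshape(-1, h, w)


# Toy data: random descriptors stand in for learned, pose-invariant ones.
rng = np.random.default_rng(0)
image_desc = rng.normal(size=(8, 8, 16))

# Pretend two template keypoints match image locations (2, 3) and (5, 6).
keypoint_desc = np.stack([image_desc[2, 3], image_desc[5, 6]])

maps = correspondence_maps(image_desc, keypoint_desc)
peaks = [tuple(int(i) for i in np.unravel_index(m.argmax(), m.shape)) for m in maps]
```

In this toy setup each map peaks at the location whose descriptor was copied into `keypoint_desc`. In a pipeline like the one the abstract outlines, a stack of such maps would be the input to the pose estimation network, which is not sketched here.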
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications
