Object Pose Estimation from Monocular Image using Multi-View Keypoint Correspondence

Jogendra Nath Kundu, Rahul M. V., Aditya Ganeshan, and R. Venkatesh Babu

arXiv:1809.00553 · cs.CV · September 5, 2018

TL;DR

This paper introduces a data-efficient, multi-view keypoint correspondence method for object pose estimation from monocular images, leveraging geometric regularity and pose-invariant descriptors to improve accuracy and reduce data dependency.

Contribution

The work presents a novel multi-view correspondence framework that enhances pose estimation accuracy and addresses data scarcity by learning pose-invariant local descriptors from simple RGB images.

Findings

01. Achieves state-of-the-art results on Pascal3D+ and ObjectNet3D datasets.

02. Effectively alleviates data scarcity issues in pose estimation.

03. Multi-view fusion significantly improves geometric understanding.

Abstract

Understanding the geometry and pose of objects in 2D images is a fundamental necessity for a wide range of real-world applications. Driven by deep neural networks, recent methods have brought significant improvements to object pose estimation. However, they suffer from the scarcity of keypoint/pose-annotated real images and hence cannot exploit the object's 3D structural information effectively. In this work, we propose a data-efficient method that utilizes the geometric regularity of intraclass objects for pose estimation. First, we learn pose-invariant local descriptors of object parts from simple 2D RGB images. These descriptors, along with keypoints obtained from renders of a fixed 3D template model, are then used to generate keypoint correspondence maps for a given monocular real image. Finally, a pose estimation network predicts 3D pose of the object using these correspondence…
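The correspondence-map step described in the abstract can be illustrated with a minimal sketch. It assumes that per-pixel descriptors of the real image and descriptors at the template keypoints are already available as arrays, and it uses cosine similarity followed by a softmax to turn them into one map per keypoint. The function name `correspondence_maps`, the array shapes, and the choice of similarity measure are assumptions made for illustration; the summary above does not specify how the paper's networks compute these maps.

```python
import numpy as np


def correspondence_maps(image_desc, keypoint_desc, eps=1e-8):
    """Hypothetical helper: build one similarity map per template keypoint.

    image_desc:    (H, W, D) array of per-pixel descriptors of the real image.
    keypoint_desc: (K, D) array of descriptors sampled at template keypoints.
    Returns a (K, H, W) array; each map is softmax-normalised to sum to 1.
    """
    h, w, d = image_desc.shape

    # Flatten the image grid and L2-normalise both descriptor sets so that
    # the dot product below is a cosine similarity.
    img = image_desc.reshape(-1, d)
    img = img / (np.linalg.norm(img, axis=1, keepdims=True) + eps)
    kp = keypoint_desc / (np.linalg.norm(keypoint_desc, axis=1, keepdims=True) + eps)

    sim = kp @ img.T  # (K, H*W) cosine similarities

    # Numerically stable softmax over all pixel locations, per keypoint.
    sim = sim - sim.max(axis=1, keepdims=True)
    maps = np.exp(sim)
    maps /= maps.sum(axis=1, keepdims=True)
    return maps.reshape(-1, h, w)
```

In this sketch, the peak of each map marks the image location whose descriptor best matches the corresponding template keypoint. A downstream pose estimation network, as mentioned in the abstract, would take such maps as input; that network is not sketched here.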

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications