A Visual Navigation Perspective for Category-Level Object Pose Estimation

Jiaxin Guo; Fangxun Zhong; Rong Xiong; Yunhui Liu; Yue Wang; Yiyi Liao

arXiv:2203.13572·cs.CV·July 26, 2022

TL;DR

This paper explores how visual navigation strategies can improve category-level object pose estimation using generative models, focusing on inference efficiency, robustness, and convergence.

Contribution

The paper investigates navigation policies for analysis-by-synthesis pose estimation and introduces a hybrid approach that it reports as outperforming existing strategies.

Findings

01. Hybrid navigation approach improves convergence and robustness.

02. Evaluation shows superior performance over state-of-the-art methods.

03. Analysis of different navigation strategies informs better inference in pose estimation.

Abstract

This paper studies category-level object pose estimation based on a single monocular image. Recent advances in pose-aware generative models have paved the way for addressing this challenging task using analysis-by-synthesis. The idea is to sequentially update a set of latent variables, e.g., pose, shape, and appearance, of the generative model until the generated image best agrees with the observation. However, convergence and efficiency are two challenges of this inference procedure. In this paper, we take a deeper look at the inference of analysis-by-synthesis from the perspective of visual navigation, and investigate what is a good navigation policy for this specific task. We evaluate three different strategies, including gradient descent, reinforcement learning and imitation learning, via thorough comparisons in terms of convergence, robustness and efficiency. Moreover, we show that…
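
The analysis-by-synthesis loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: a toy 1-D renderer stands in for a pose-aware generative model, the only latent variable is a scalar pose, and the names `render`, `loss`, and `estimate_pose` are hypothetical. The sketch covers only the gradient descent strategy, using a finite-difference gradient, and also shows the kind of convergence problem the abstract mentions when the initial pose is far from the target.

```python
import math

def render(pose, n_pixels=64, width=3.0):
    """Toy stand-in for a generative model: a 1-D image with a Gaussian blob at `pose`."""
    return [math.exp(-((i - pose) ** 2) / (2.0 * width ** 2)) for i in range(n_pixels)]

def loss(pose, observed):
    """Squared pixel error between the rendered image and the observation."""
    return sum((r - o) ** 2 for r, o in zip(render(pose), observed))

def estimate_pose(observed, init_pose, lr=0.5, steps=200, eps=1e-4):
    """Update the pose latent by gradient descent (central finite-difference gradient)."""
    pose = init_pose
    for _ in range(steps):
        grad = (loss(pose + eps, observed) - loss(pose - eps, observed)) / (2.0 * eps)
        pose -= lr * grad
    return pose

true_pose = 40.0
observed = render(true_pose)

# Initialization near the target: the rendered blob overlaps the observed one,
# so the gradient is informative and the estimate converges.
near = estimate_pose(observed, init_pose=36.0)

# Initialization far from the target: the blobs barely overlap, the gradient is
# almost zero, and the estimate hardly moves within the step budget.
far = estimate_pose(observed, init_pose=12.0)

print(f"near init -> {near:.4f}, far init -> {far:.4f}")
```

In this toy setting, plain gradient descent succeeds only when the initial pose already overlaps the observation, which is one way to see why the choice of navigation policy matters for convergence and efficiency.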

Code & Models

Repositories

wrld/visual_navigation_pose_estimation (PyTorch, official)

Taxonomy

Topics: Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications · Human Pose and Action Recognition