PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects

Pengyuan Wang; HyunJun Jung; Yitong Li; Siyuan Shen; Rahul Parthasarathy Srikanth; Lorenzo Garattoni; Sven Meier; Nassir Navab; Benjamin Busam

arXiv:2205.08811·cs.CV·May 19, 2022

TL;DR

PhoCaL is a comprehensive multi-modal dataset designed for category-level object pose estimation, featuring photometrically challenging objects like reflective and transparent items, with high-precision annotations to advance robotic and AR applications.

Contribution

The paper introduces PhoCaL, a novel dataset with high-quality annotations for category-level pose estimation of challenging objects, supported by a new multi-modal data collection and annotation process.

Findings

01 State-of-the-art methods perform variably on PhoCaL
02 Photometrically challenging objects remain difficult for current algorithms
03 PhoCaL provides a new benchmark for future research

Abstract

Object pose estimation is crucial for robotic applications and augmented reality. Beyond instance-level 6D object pose estimation methods, estimating category-level pose and shape has become a promising trend. As such, a new research field needs to be supported by well-designed datasets. To provide a benchmark with high-quality ground truth annotations to the community, we introduce a multimodal dataset for category-level object pose estimation with photometrically challenging objects, termed PhoCaL. PhoCaL comprises 60 high-quality 3D models of household objects across 8 categories, including highly reflective, transparent and symmetric objects. We developed a novel robot-supported multi-modal (RGB, depth, polarisation) data acquisition and annotation process. It ensures sub-millimeter accuracy of the pose for opaque textured, shiny and transparent objects, no motion blur and perfect…

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques