PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects
Pengyuan Wang, HyunJun Jung, Yitong Li, Siyuan Shen, Rahul Parthasarathy Srikanth, Lorenzo Garattoni, Sven Meier, Nassir Navab, Benjamin Busam

TL;DR
PhoCaL is a comprehensive multi-modal dataset designed for category-level object pose estimation, featuring photometrically challenging objects like reflective and transparent items, with high-precision annotations to advance robotic and AR applications.
Contribution
The paper introduces PhoCaL, a novel dataset with high-quality annotations for category-level pose estimation of challenging objects, supported by a new multi-modal data collection and annotation process.
Findings
State-of-the-art methods perform variably on PhoCaL
Photometrically challenging objects remain difficult for current algorithms
PhoCaL provides a new benchmark for future research
Abstract
Object pose estimation is crucial for robotic applications and augmented reality. Beyond instance-level 6D object pose estimation methods, estimating category-level pose and shape has become a promising trend. Such a new research field needs to be supported by well-designed datasets. To provide a benchmark with high-quality ground truth annotations to the community, we introduce a multimodal dataset for category-level object pose estimation with photometrically challenging objects termed PhoCaL. PhoCaL comprises 60 high-quality 3D models of household objects over 8 categories including highly reflective, transparent and symmetric objects. We developed a novel robot-supported multi-modal (RGB, depth, polarisation) data acquisition and annotation process. It ensures sub-millimeter accuracy of the pose for opaque textured, shiny and transparent objects, no motion blur and perfect…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
