RoCap: A Robotic Data Collection Pipeline for the Pose Estimation of Appearance-Changing Objects

Jiahao Nick Li; Toby Chong; Zhongyi Zhou; Hironori Yoshida; Koji Yatani; Xiang 'Anthony' Chen; Takeo Igarashi

arXiv:2407.08081·cs.RO·July 12, 2024

TL;DR

RoCap is a robotic data collection pipeline that captures images of an object in many poses, each labeled with its ground truth pose. Pose estimation models for appearance-changing objects trained on this data outperform models trained on traditional synthetic data.

Contribution

The paper introduces RoCap, a robotic pipeline that generates labeled training data for the pose estimation of appearance-changing objects, addressing the limitations of methods designed for static objects.

Findings

1. RoCap captures diverse object poses with ground truth labels.
2. Models trained on RoCap data outperform models trained on synthetic data.
3. The approach improves pose estimation accuracy for deformable, transparent, reflective, and articulated objects.

Abstract

Object pose estimation plays a vital role in mixed-reality interactions when users manipulate tangible objects as controllers. Traditional vision-based object pose estimation methods leverage 3D reconstruction to synthesize training data. However, these methods are designed for static objects with diffuse colors and do not work well for objects that change their appearance during manipulation, such as deformable objects like plush toys, transparent objects like chemical flasks, reflective objects like metal pitchers, and articulated objects like scissors. To address this limitation, we propose RoCap, a robotic pipeline that emulates human manipulation of target objects while generating data labeled with ground truth pose information. The user first gives the target object to a robotic arm, and the system captures many pictures of the object in various 6D configurations. The system…
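
As an illustration of how a robot-held object can yield ground truth 6D labels, the following minimal sketch composes rigid transforms to express the object's pose in the camera frame. It is not the authors' implementation: the function names, the sampling ranges, the ZYX Euler convention, and the assumption that a fixed camera-to-base calibration and a fixed grasp transform are known are all hypothetical choices made for this example.

```python
import math
import random


def pose_to_matrix(x, y, z, roll, pitch, yaw):
    """Build a 4x4 homogeneous transform from a translation (metres)
    and ZYX Euler angles (radians): R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, x],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, y],
        [-sp, cp * sr, cp * cr, z],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def sample_gripper_pose(rng):
    """Sample a hypothetical 6D gripper pose in the robot base frame.
    The ranges below are placeholders, not values from the paper."""
    return (
        rng.uniform(0.3, 0.5), rng.uniform(-0.1, 0.1), rng.uniform(0.2, 0.4),
        rng.uniform(-math.pi, math.pi),
        rng.uniform(-math.pi / 2, math.pi / 2),
        rng.uniform(-math.pi, math.pi),
    )


def object_pose_in_camera(gripper_pose, cam_from_base, gripper_from_object):
    """Ground truth label: cam_from_object = cam_from_base * base_from_gripper
    * gripper_from_object. Assumes the object is held rigidly in the gripper."""
    base_from_gripper = pose_to_matrix(*gripper_pose)
    return matmul(matmul(cam_from_base, base_from_gripper), gripper_from_object)


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed calibration: camera 1 m above the base origin, axes aligned.
    cam_from_base = pose_to_matrix(0.0, 0.0, 1.0, 0.0, 0.0, 0.0)
    # Assumed grasp: object origin 5 cm along the gripper's z axis.
    gripper_from_object = pose_to_matrix(0.0, 0.0, 0.05, 0.0, 0.0, 0.0)
    for i in range(3):
        pose = sample_gripper_pose(rng)
        label = object_pose_in_camera(pose, cam_from_base, gripper_from_object)
        # In a real pipeline an image would be captured here and stored
        # alongside the label.
        print(i, [round(v, 3) for v in (label[0][3], label[1][3], label[2][3])])
```

The design choice in this sketch is that the label comes from the arm's kinematics rather than from the image, so under these assumptions it does not depend on how the object looks in any particular frame.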

Taxonomy

Topics: Face recognition and analysis · Robotics and Sensor-Based Localization