StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation
Xingyu Liu, Shun Iwase, Kris M. Kitani

TL;DR
StereOBJ-1M is a large-scale stereo image dataset of over 393,000 frames for 6D object pose estimation. It targets challenging cases such as transparent, translucent, and reflective objects, and is sized for training modern deep learning models.
Contribution
The paper introduces a large-scale stereo dataset fully annotated with 6D object poses, together with a multi-view pose annotation method that allows data capture in complex and flexible environments.
Findings
Benchmark results for state-of-the-art pose estimators on StereOBJ-1M
Proposed pose optimization method improves the accuracy of poses computed from keypoints
Dataset enables research on transparent and reflective objects
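To make the idea of computing a 6D pose from keypoints concrete, here is a minimal sketch in Python. It is not the paper's optimization method: it swaps in the classical Kabsch (SVD-based) rigid alignment, and it assumes the 3D keypoints have already been recovered in the camera frame, for example by stereo triangulation. The function name `kabsch_pose` and the toy keypoints are hypothetical and used only for illustration.

```python
import numpy as np

def kabsch_pose(model_pts, observed_pts):
    """Least-squares rigid fit: find R, t with observed ~= R @ model + t.

    model_pts, observed_pts: (N, 3) arrays of corresponding 3D keypoints.
    This is the classical Kabsch alignment, not the paper's method.
    """
    mc = model_pts.mean(axis=0)                     # model centroid
    oc = observed_pts.mean(axis=0)                  # observed centroid
    H = (model_pts - mc).T @ (observed_pts - oc)    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = oc - R @ mc
    return R, t

# Toy check with a known pose (all values are made up for illustration).
rng = np.random.default_rng(0)
model = rng.normal(size=(8, 3))                     # hypothetical object keypoints
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.1, -0.2, 0.5])
observed = model @ R_true.T + t_true                # keypoints in the camera frame
R_est, t_est = kabsch_pose(model, observed)
```

The rotation `R_est` and translation `t_est` together form the 6D pose (three rotational and three translational degrees of freedom). With noisy keypoints the same fit gives a least-squares estimate, which is the kind of starting point a further pose optimization step could refine.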
Abstract
We present a large-scale stereo RGB image object pose estimation dataset named StereOBJ-1M. The dataset is designed to address challenging cases such as object transparency, translucency, and specular reflection, in addition to the common challenges of occlusion, symmetry, and variations in illumination and environments. In order to collect data of sufficient scale for modern deep learning models, we propose a novel method for efficiently annotating pose data in a multi-view fashion that allows data capturing in complex and flexible environments. Fully annotated with 6D object poses, our dataset contains over 393K frames and over 1.5M annotations of 18 objects recorded in 182 scenes constructed in 11 different environments. The 18 objects include 8 symmetric objects, 7 transparent objects, and 8 reflective objects. We benchmark two state-of-the-art pose estimation…
Taxonomy
Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · Image and Object Detection Techniques
