Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images
Chuanrui Zhang, Yonggen Ling, Minglei Lu, Minghan Qin, Haoqian Wang

TL;DR
This paper introduces CODERS, a one-stage stereo-based method for category-level object detection, pose estimation, and reconstruction that outperforms existing approaches and generalizes well from simulation to real-world robot manipulation.
Contribution
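The paper presents an implicit stereo matching module that combines stereo image features with 3D position information. Coupled with a transform-decoder architecture, it enables end-to-end learning of the detection, pose estimation, and reconstruction tasks required for robot manipulation.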
Findings
Outperforms all competing methods on the TOD dataset
Generalizes well from simulated to real-world data
Enables end-to-end learning of detection, pose, and reconstruction
Abstract
We study the 3D object understanding task for manipulating everyday objects with different material properties (diffuse, specular, transparent and mixed). Existing monocular and RGB-D methods suffer from scale ambiguity due to missing or imprecise depth measurements. We present CODERS, a one-stage approach for Category-level Object Detection, pose Estimation and Reconstruction from Stereo images. The base of our pipeline is an implicit stereo matching module that combines stereo image features with 3D position information. Concatenating this module with the subsequent transform-decoder architecture leads to end-to-end learning of the multiple tasks required by robot manipulation. Our approach significantly outperforms all competing methods on the public TOD dataset. Furthermore, trained on simulated data, CODERS generalizes well to unseen category-level object instances in real-world…
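To make the scale-ambiguity point concrete, here is a minimal sketch of classical rectified-stereo triangulation, where metric depth follows from disparity as Z = f * B / d. This is not the implicit stereo matching module described in the abstract, only the textbook geometry that explains why a stereo pair with a known baseline pins down metric scale. The function name and the camera values are hypothetical and chosen purely for illustration.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Metric depth (meters) of a point seen by a rectified stereo pair.

    focal_px:     focal length in pixels (assumed equal for both cameras)
    baseline_m:   distance between the two camera centers in meters
    disparity_px: horizontal pixel shift of the point between left and right images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical camera parameters, for illustration only.
FOCAL_PX = 640.0
BASELINE_M = 0.125

near = depth_from_disparity(FOCAL_PX, BASELINE_M, disparity_px=40.0)
far = depth_from_disparity(FOCAL_PX, BASELINE_M, disparity_px=20.0)
print(near, far)  # halving the disparity doubles the depth
```

A single image provides no disparity, so an object twice as large at twice the distance projects identically. The known baseline is what removes that ambiguity in the stereo case.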
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging · Robotics and Sensor-Based Localization
Methods: Balanced Selection
