TL;DR
This paper introduces the CoSTAR Block Stacking Dataset, a large, real-world dataset for training and evaluating neural network models on a block stacking task with workspace constraints, highlighting the limitations of existing models.
Contribution
The paper presents a new dataset with real-time, dynamic scenes for block stacking, and establishes a baseline with a novel neural architecture search method.
Findings
Existing neural networks do not generalize well to the new dataset.
A novel HyperTree MetaModel effectively predicts 3D poses for stacking.
The dataset enables more realistic training and evaluation of robotic grasping and stacking.
Abstract
A robot can now grasp an object more effectively than ever before, but once it has the object what happens next? We show that a mild relaxation of the task and workspace constraints implicit in existing object grasping datasets can cause neural network based grasping algorithms to fail on even a simple block stacking task when executed under more realistic circumstances. To address this, we introduce the JHU CoSTAR Block Stacking Dataset (BSD), where a robot interacts with 5.1 cm colored blocks to complete an order-fulfillment style block stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets. There are nearly 12,000 stacking attempts and over 2 million frames of real data. We discuss the ways in which this dataset provides a valuable resource for a broad range of other topics of investigation. We find that…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsHyperTree MetaModel
