ProgressLabeller: Visual Data Stream Annotation for Training Object-Centric 3D Perception
Xiaotong Chen, Huijie Zhang, Zeren Yu, Stanley Lewis, Odest Chadwicke Jenkins

TL;DR
ProgressLabeller is a scalable, efficient tool for generating large 6D pose datasets from color images, supporting transparent objects, and improving robotic grasping performance through fine-tuning.
Contribution
The paper introduces ProgressLabeller, a method for rapid, scalable annotation of 6D pose data from color image sequences, including scenes with transparent objects, so that training data for custom scenes can be generated more efficiently.
Findings
Created over 1 million training samples rapidly.
Fine-tuning with ProgressLabeller improves robotic grasp success.
Supports transparent and translucent objects.
Abstract
Visual perception tasks often require vast amounts of labelled data, including 3D poses and image space segmentation masks. The process of creating such training data sets can prove difficult or time-intensive to scale up to efficacy for general use. Consider the task of pose estimation for rigid objects. Deep neural network based approaches have shown good performance when trained on large, public datasets. However, adapting these networks for other novel objects, or fine-tuning existing models for different environments, requires significant time investment to generate newly labelled instances. Towards this end, we propose ProgressLabeller as a method for more efficiently generating large amounts of 6D pose training data from color image sequences for custom scenes in a scalable manner. ProgressLabeller is intended to also support transparent or translucent objects, for which the…
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Neural Network Applications · Human Pose and Action Recognition
