Semi-automatic 3D Object Keypoint Annotation and Detection for the Masses
Kenneth Blomqvist, Jen Jen Chung, Lionel Ott, Roland Siegwart

TL;DR
This paper introduces a semi-automatic toolkit for efficient 3D object keypoint annotation and detection, enabling rapid dataset creation and model training for robotics applications.
Contribution
The work presents a comprehensive toolkit that streamlines data collection, labeling, and learning for 3D object keypoints using a wrist-mounted camera on a robotic arm.
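To illustrate why a wrist-mounted camera can reduce labeling effort, here is a minimal sketch of one possible propagation step. It rests on assumptions that this summary does not state: that the camera pose for every frame is known (for example from the arm's kinematics), that a keypoint's 3D position in the world frame is already available, and that a simple pinhole camera model applies. The names `project_keypoint` and `label_frames` are hypothetical and are not taken from the toolkit.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]  # 4x4 homogeneous transform, row-major


def project_keypoint(
    point_world: Sequence[float],
    T_camera_world: Matrix,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Tuple[float, float]:
    """Project one 3D point (world frame) to pixel coordinates (pinhole model)."""
    x, y, z = point_world
    # Move the point into the camera frame using the top three rows of the pose.
    xc, yc, zc = (
        row[0] * x + row[1] * y + row[2] * z + row[3]
        for row in T_camera_world[:3]
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division followed by the intrinsic scaling and offset.
    return fx * xc / zc + cx, fy * yc / zc + cy


def label_frames(
    point_world: Sequence[float],
    poses: Sequence[Matrix],
    intrinsics: Tuple[float, float, float, float],
) -> List[Tuple[float, float]]:
    """Reproject a single 3D keypoint into every frame with a known pose."""
    fx, fy, cx, cy = intrinsics
    return [project_keypoint(point_world, T, fx, fy, cx, cy) for T in poses]


# Hypothetical example: two camera poses (world-to-camera) and made-up intrinsics.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
shifted = [[1, 0, 0, 0.1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
labels = label_frames((0.0, 0.0, 1.0), [identity, shifted], (500.0, 500.0, 320.0, 240.0))
```

Under these assumptions, one 3D annotation yields a 2D label in every frame of a sequence, which is the kind of saving a semi-automatic pipeline aims for.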
Findings
Achieved a functional 3D keypoint detector within a few hours.
Reduced manual effort in dataset annotation through semi-automatic methods.
Demonstrated the effectiveness of the toolkit in robotics scenarios.
Abstract
Creating computer vision datasets requires careful planning and lots of time and effort. In robotics research, we often have to use standardized objects, such as the YCB object set, for tasks such as object tracking, pose estimation, grasping, and manipulation, because datasets and pre-learned methods are available for these objects. This limits the impact of our research, since learning-based computer vision methods can only be used in scenarios that are supported by existing datasets. In this work, we present a full object keypoint tracking toolkit that covers the entire process, from data collection and labeling to model learning and evaluation. We present a semi-automatic way of collecting and labeling datasets using a wrist-mounted camera on a standard robotic arm. Using our toolkit and method, we are able to obtain a working 3D object keypoint detector and go through the whole…
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
