Mobile Robot Manipulation using Pure Object Detection
Brent Griffin

TL;DR
This paper presents a novel end-to-end mobile robot manipulation method based solely on object detection, utilizing a new few-shot detection approach that enables learning with minimal annotations and improves manipulation in cluttered environments.
Contribution
It introduces Task-focused Few-shot Object Detection (TFOD) for robotic manipulation, allowing robots to learn new objects with minimal data and automatically retrain detection models during operation.
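One way to picture "automatically retrain detection models during operation" is a per-subtask counter that requests retraining after repeated missed detections. The sketch below is illustrative only and is not the paper's criterion: the class name `SubtaskDetectionMonitor`, the `miss_threshold` value, and the subtask label are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SubtaskDetectionMonitor:
    """Hypothetical helper that counts consecutive missed detections per subtask."""

    miss_threshold: int = 3  # assumed value, not taken from the paper
    misses: Dict[str, int] = field(default_factory=dict)

    def record(self, subtask: str, detected: bool) -> bool:
        """Record one detection attempt.

        Returns True when the number of consecutive misses for this subtask
        reaches the threshold, i.e. when retraining should be requested.
        """
        if detected:
            # A successful detection resets the miss counter for this subtask.
            self.misses[subtask] = 0
            return False
        self.misses[subtask] = self.misses.get(subtask, 0) + 1
        return self.misses[subtask] >= self.miss_threshold


monitor = SubtaskDetectionMonitor(miss_threshold=2)
first = monitor.record("grasp", detected=False)   # one miss, below threshold
second = monitor.record("grasp", detected=False)  # second miss, request retraining
```

A real system would pair such a trigger with data collection and a request for a small number of new annotations; those steps are omitted here.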
Findings
Robot learns visual control from a single click of annotation.
Achieves state-of-the-art results on visual servo control and depth estimation benchmarks.
Successfully manipulates new objects in cluttered and mobile settings.
Abstract
This paper addresses the problem of mobile robot manipulation using object detection. Our approach uses detection and control as complementary functions that learn from real-world interactions. We develop an end-to-end manipulation method based solely on detection and introduce Task-focused Few-shot Object Detection (TFOD) to learn new objects and settings. Our robot collects its own training data and automatically determines when to retrain detection to improve performance across various subtasks (e.g., grasping). Notably, detection training is low-cost, and our robot learns to manipulate new objects using as few as four clicks of annotation. In physical experiments, our robot learns visual control from a single click of annotation and a novel update formulation, manipulates new objects in clutter and other mobile settings, and achieves state-of-the-art results on an existing visual…
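To make "visual control from detection" concrete, here is a minimal sketch of a proportional step that pushes a detected bounding box's center toward a target pixel. It is a generic illustration, not the paper's learned controller or its update formulation. The function names, the fixed `gain`, and the `(x_min, y_min, x_max, y_max)` box layout are assumptions.

```python
from typing import Tuple

# Assumed box layout in pixels: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]
Point = Tuple[float, float]


def box_center(box: Box) -> Point:
    """Return the pixel center of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def servo_step(box: Box, target: Point, gain: float = 0.5) -> Point:
    """Proportional correction, in pixel units, from box center toward target.

    A real robot would still need a mapping from this image-space error to
    joint or base motion; that mapping is outside the scope of this sketch.
    """
    cx, cy = box_center(box)
    return (gain * (target[0] - cx), gain * (target[1] - cy))


# Hypothetical detection whose center is (150, 100); the target pixel is (160, 120).
command = servo_step((100.0, 50.0, 200.0, 150.0), (160.0, 120.0))
```

The fixed gain stands in for whatever relation between image error and motion the robot actually uses. It is chosen only to keep the example short.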
Videos
Mobile Robot Manipulation using Pure Object Detection · YouTube
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Robot Manipulation and Learning
